NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as 
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past 
decade. The now routine hybridization cloning and expression cloning techniques clone novel 
polynucleotides "directly" in the sense that they rely on information directly related to the 
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of 
hybridization cloning; activity of the protein in the case of expression cloning). More recent 
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences 
based on the presence of a now well-recognized secretory leader sequence motif, as well as 
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the 
state of the art by making available large numbers of DNA/amino acid sequences for proteins 
that are known to have biological activity, for example, by virtue of their secreted nature in the 
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based 
techniques, or by virtue of structural similarity to other genes of known biological activity. 

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a
signature region (as set forth in Table 3); or for which they have homology to a gene family (as
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
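
By way of illustration only, the pairing rule in the foregoing definition can be sketched in a few lines of Python. The function names (complement, reverse_complement, fraction_paired) are hypothetical and are not part of the invention; the sketch assumes plain A/C/G/T strings of equal length and ignores ambiguity codes.

```python
# Watson-Crick pairing for DNA bases (assumption: only A, C, G, T occur).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement, read in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the complementary strand written in the 5'->3' direction."""
    return complement(seq)[::-1]


def fraction_paired(strand_5to3, other_3to5):
    """Fraction of facing positions that form Watson-Crick pairs.

    The second strand is written 3'->5', so position i of one strand
    faces position i of the other. 1.0 means "complete" complementarity;
    anything lower is "partial".
    """
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have equal length")
    paired = sum(
        1
        for a, b in zip(strand_5to3.upper(), other_3to5.upper())
        if PAIRS.get(a) == b
    )
    return paired / len(strand_5to3)


# The example from the definition: 5'-AGT-3' pairs with 3'-TCA-5'.
print(complement("AGT"))              # TCA (written 3'->5')
print(fraction_paired("AGT", "TCA"))  # 1.0, i.e. complete complementarity
```

A value below 1.0 from fraction_paired corresponds to the "partial" complementarity described above.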

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
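
The arithmetic behind these estimates can be reproduced with the following minimal sketch, offered for illustration only. It assumes, as stated above, three billion base pairs per chromosome set and treats the expressed fraction as exactly 5%; the names GENOME_BP, EXPRESSED_FRACTION and kmers_per_site are hypothetical. The exact ratios come out somewhat above the rounded figures quoted in the text.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # assumption: the "less than approximately 5%" bound taken as 5%


def kmers_per_site(k, target_bp=GENOME_BP):
    """Ratio of the number of possible k-mers (4**k) to the number of bases searched.

    A ratio of R means a given k-mer is expected to be fully matched
    by chance about once in R such targets.
    """
    return 4 ** k / target_bp


twenty_mer = kmers_per_site(20)       # about 366, rounded to "1 in 300" in the text
seventeen_mer = kmers_per_site(17)    # about 5.7, i.e. roughly "1 in 5"
fifteen_mer_expressed = kmers_per_site(15, GENOME_BP * EXPRESSED_FRACTION)  # about 7.2

print(f"20-mer, whole genome:        1 in {twenty_mer:.0f}")
print(f"17-mer, whole genome:        1 in {seventeen_mer:.1f}")
print(f"15-mer, expressed sequences: 1 in {fifteen_mer_expressed:.1f}")
```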

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
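
As a rough check of the single-mismatch estimates, the calculation described above (full-match probability multiplied by 3 times the segment length) can be sketched as follows. This is an illustrative approximation only: the function name single_mismatch_odds is hypothetical, the genome size is the three billion base pairs stated earlier, and the expressed fraction is assumed to be exactly 5%. Under these assumptions the results are of the same order as, though not identical to, the "one in five" figures quoted.

```python
GENOME_BP = 3_000_000_000          # base pairs per chromosome set (from the text)
EXPRESSED_BP = GENOME_BP * 0.05    # assumption: expressed sequences taken as 5% of the genome


def single_mismatch_odds(k, target_bp):
    """Return N such that a k-mer with one mismatch occurs by chance about 1 in N.

    Per-position probability = (1 / 4**k) * (3 * k): the full-match
    probability times the 3 alternative bases at each of the k positions.
    Multiplying by the number of bases searched gives the expected hits.
    """
    per_position = (1 / 4 ** k) * (3 * k)
    expected_hits = per_position * target_bp
    return 1 / expected_hits


twenty_mer_genome = single_mismatch_odds(20, GENOME_BP)        # about 6.1
eighteen_mer_expressed = single_mismatch_odds(18, EXPRESSED_BP)  # about 8.5

print(f"20-mer, one mismatch, genome:    1 in {twenty_mer_genome:.1f}")
print(f"18-mer, one mismatch, expressed: 1 in {eighteen_mer_expressed:.1f}")
```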

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
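
For illustration, the definition above can be expressed as a short Python sketch that scans one reading frame and returns the longest run of triplets containing no termination codon. The function name longest_open_stretch is hypothetical, the stop codons are those of the standard genetic code (an assumption, since the text does not name them), and no start codon is required because the definition does not mention one.

```python
# Termination codons of the standard genetic code (assumption: standard code applies).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_stretch(seq, frame=0):
    """Return the longest run of in-frame triplets that contains no stop codon.

    `frame` is the 0-based offset (0, 1 or 2) at which triplet reading starts.
    Trailing bases that do not fill a complete triplet are ignored.
    """
    seq = seq.upper()
    best = ""
    current = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []            # a stop codon ends the current open stretch
        else:
            current.append(codon)
            if 3 * len(current) > len(best):
                best = "".join(current)
    return best


# ATG AAA | TAG | CCC GGG TTT: the stop splits the frame into two open stretches.
print(longest_open_stretch("ATGAAATAGCCCGGGTTT"))  # CCCGGGTTT
```

Calling the function for frame values 0, 1 and 2 on a sequence and on its reverse complement would cover all six reading frames.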

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
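
The four groupings listed above lend themselves to a simple lookup, sketched here for illustration only. The names GROUPS, group_of and is_conservative are hypothetical, residues are given by their standard one-letter codes, and treating "same group" as the sole test for a conservative replacement is a simplifying assumption; the text also permits other bases for similarity.

```python
# Amino acid groups exactly as enumerated in the text, by one-letter code.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def group_of(residue):
    """Return the name of the group containing a one-letter residue code."""
    residue = residue.upper()
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """True if both residues fall in the same group (simplifying assumption)."""
    return group_of(original) == group_of(replacement)


print(is_conservative("L", "I"))  # True: leucine and isoleucine are both nonpolar
print(is_conservative("K", "D"))  # False: lysine is basic, aspartic acid is acidic
```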

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

1 5 The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 

at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 

polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 

defines a polypeptide or protein essentially free of native endogenous substances and

unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus

or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 

appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling

extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 

protein is expressed without a leader or transport sequence, it may include an amino terminal 

methionine residue. This residue may or may not be subsequently cleaved from the expressed 

recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 

a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 

transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 

express heterologous polypeptides or proteins upon induction of the regulatory elements linked 

to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 

can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P. A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the

art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1 X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 

described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent

hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 

14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and

60°C (for 23-base oligonucleotides). 
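
As a purely illustrative aid, the oligonucleotide wash temperatures recited above can be tabulated as a lookup. The names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and rejecting unlisted lengths is an assumption of this sketch: the text gives no rule for other oligonucleotide lengths, so none is interpolated here.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as recited in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the recited wash temperature for a listed oligonucleotide length.

    Lengths not listed in the text raise ValueError (an assumption of this
    sketch; the text does not say how to treat them).
    """
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no wash temperature is recited for a {oligo_length}-base oligonucleotide"
        ) from None
```

For example, `wash_temperature_c(20)` returns the value recited for 20-base oligonucleotides, while a length such as 18 is rejected rather than guessed.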

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid

sequences, for example a mutant sequence, that varies from a reference sequence by one or more 

substitutions, deletions, or additions, the net effect of which does not result in an adverse 

functional dissimilarity between the reference and subject sequences. Typically, such a 

substantially equivalent sequence varies from one of those listed herein by no more than about 

35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a

substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 

listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid

sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression 
characteristics are considered substantially equivalent. For the purposes of determining 
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 

Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
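
The arithmetic in the definition above (the number of residue substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the count of changes is assumed to come from an alignment performed elsewhere, and the sketch does not implement the Jotun Hein method or address functional equivalence.

```python
def percent_identity(changes: int, length: int) -> float:
    """Percent identity implied by `changes` edits over `length` residues.

    `changes` counts individual substitutions, additions, and deletions;
    `length` is the total number of residues in the variant sequence.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if changes < 0:
        raise ValueError("changes must be non-negative")
    return 100.0 * (length - changes) / length


def within_variation_limit(changes: int, length: int, max_fraction: float = 0.35) -> bool:
    """True if changes/length does not exceed max_fraction (0.35 by default)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return changes / length <= max_fraction
```

Under this definition, a 100-residue variant with 35 changes has 65% sequence identity, and a variant with 5 changes has 95% sequence identity. Passing `max_fraction=0.20` corresponds to the 80% identity embodiment.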

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 

DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether

or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 

of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 

which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified

using known UMFs as a target sequence or target motif with the computer-based systems 

described below. The presence and activity of a UMF can be confirmed by attaching the 

suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 

with an appropriate host under appropriate conditions and the uptake of the marker sequence is 

determined. As described above, a UMF will increase the frequency of uptake of a linked

marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 



4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO:1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any

polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of 
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144. 
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 

receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed

herein. The corresponding genes can be isolated in accordance with known methods using the 
sequence information disclosed herein. Such methods include the preparation of probes or primers 
from the disclosed sequence information for identification and/or amplification of genes in 
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can 

be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 

representative fragment or segment information, or novel segment information for the full-length
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 

75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated. 
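
The codon substitution contemplated above can be illustrated with a short sketch that translates two coding sequences under the standard genetic code and checks whether they encode the same amino acid sequence. The function names and the example sequences are hypothetical and are not taken from the Sequence Listing; the sketch assumes an in-frame DNA coding sequence containing only A, C, G, and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For instance, the hypothetical sequences `ATGGCTTGT` and `ATGGCCTGC` differ at two third-codon positions yet both translate to Met-Ala-Cys, so `encode_same_protein` reports them as coding for the same amino acid sequence.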

The nearest neighbor or homology result for the nucleic acids of the present invention, 
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, may be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 

suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 

polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 

sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic 
acid alterations can be made at sites that differ in the nucleic acids from different species 
(variable positions) or in highly conserved regions (constant regions). Sites at such locations 

will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions 
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid 
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues, 
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 

sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the 

site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as, Edelman et al., 
DNA 2: 183 (1983). A versatile and efficient method for producing site-specific changes in a 
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. 

When small amounts of template DNA are used as starting material, primer(s) that differs

slightly in sequence from the corresponding region in the template DNA can generate the desired 
amino acid variant. PCR amplification results in a population of product DNA fragments that 
differ from the polynucleotide template encoding the polypeptide at the position specified by the 
primer. The product DNA fragments replace the corresponding region in the plasmid and this 

gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic 

code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression 
of these novel nucleic acids. Such DNA sequences include those which are capable of 
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used

to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 

domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 

polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or

synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 

to those of skill in the art and can include, for example, methods for determining hybridization 

conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 

protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 

nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 

invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 

organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358
or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 

available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 

Generally, recombinant expression vectors will include origins of replication and selectable

markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli 
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such promoters can be derived from operons 
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired 

characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of replication to ensure maintenance of the 

vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine 
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These 
pBR322 "backbone" sections are combined with an appropriate promoter and the structural 

sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies 
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 

sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA. 

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" 
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded 
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic 
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic 
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of 
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic 
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.






WO 01/53312 PCT/US00/34263 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
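
By way of illustration only, the Watson and Crick design rule referred to above can be sketched as taking the reverse complement of a chosen window of the coding strand, for example a window surrounding the translation start site. The function names, the window coordinates, and the example sequence are hypothetical; the sketch handles unmodified DNA bases only and does not address Hoogsteen pairing, modified nucleotides, or target selection.

```python
# Watson-Crick complements for unmodified DNA bases (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Antisense DNA oligonucleotide against coding_strand[start:start + length].

    `start` is a 0-based offset into the coding (sense) strand. The result
    is complementary to the corresponding stretch of the sense sequence.
    """
    if start < 0 or length <= 0:
        raise ValueError("start must be non-negative and length positive")
    target = coding_strand[start:start + length]
    if len(target) != length:
        raise ValueError("requested window extends past the end of the sequence")
    return reverse_complement(target)
```

With the hypothetical coding strand `GCCACCATGGCTTGTAAA`, in which the ATG start codon begins at offset 6, `antisense_oligo(seq, 3, 10)` targets the window `ACCATGGCTT` around the start codon and yields the 10-nucleotide oligonucleotide `AAGCCATGGT`.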

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a 

3 5 nucleic acid has been subcloned in an antisense orientation (/. e. , RNA transcribed from the 
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inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 

described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the 
strands run parallel to each other (Gaultier et al (1987) Nucleic Acids Res 15: 6625-6641). The 
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. 
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al (1987) 
FEBS Lett 215: 327-330). 



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988) 
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit 
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be 
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1- 
1786 and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be 
constructed in which the nucleotide sequence of the active site is complementary to the 
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al U.S. Pat. 
No. 4,987,071; and Cech et al U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be 
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA 
molecules. See, Bartel et al, (1993) Science 261:1411-1418. 
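
One way to picture ribozyme target selection is to scan an mRNA for candidate cleavage triplets. The sketch below assumes the commonly cited rule that hammerhead ribozymes cleave 3' of an NUH triplet (N = any nucleotide, H = A, C or U); that rule, the function name, and the toy sequences are assumptions for illustration and are not stated in this specification.

```python
import re

# Assumed rule (not from this specification): cleavage occurs 3' of NUH,
# where N is any nucleotide and H is A, C or U.
NUH_PATTERN = "[ACGU]U[ACU]"


def hammerhead_sites(mrna: str, triplet_pattern: str = NUH_PATTERN) -> list:
    """Return 0-based positions immediately 3' of each matching triplet."""
    seq = mrna.upper().replace("T", "U")
    # A lookahead is used so that overlapping triplets are all reported.
    lookahead = re.compile("(?=(" + triplet_pattern + "))")
    return [match.start() + 3 for match in lookahead.finditer(seq)]


if __name__ == "__main__":
    print(hammerhead_sites("AAGUCAA"))
    print(hammerhead_sites("GUCGUA"))
```

Each reported position marks where the flanking arms of a candidate ribozyme would be made complementary to the sequence on either side; accessibility of the site in the folded mRNA is not considered here.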

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991) 
Anticancer Drug Des. 6: 569-84; Helene et al (1992) Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher (1992) Bioessays 14: 807-15. 

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al (1996) Bioorg Med 
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al (1996) above; 
Perry-O'Keefe et al (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or 
primers for DNA sequence and hybridization (Hyrup et al (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked 
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA 
and the 5' end of DNA (Mag et al (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then 
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al (1975) Bioorg Med Chem 
Lett 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al, 1989, Proc. Natl Acad. Sci. U.S.A. 86:6553-6556; 
Lemaitre et al, 1987, Proc. Natl Acad Sci. 84:648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, 
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et 
al, 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res. 
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, 
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 



4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 5359- 
7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1- 
1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the 
invention also include polypeptides preferably with biological or immunological activity that are 
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID 
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences 
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the 
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. 
The invention also provides biologically active or immunologically active variants of any of the 
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding 
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about 
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at 
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by 
allelic variants may have a similar, increased, or decreased activity compared to polypeptides 
comprising SEQ ID NO:1787-3572 and 5359-7144. 
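
The percent amino acid identity thresholds recited above can be made concrete with a small calculation. The following Python sketch assumes two sequences that have already been aligned (gaps written as "-") and counts identical columns over all aligned columns; that choice of denominator is only one of several conventions in use, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns holding identical residues.

    Assumes the inputs are pre-aligned and of equal length. Columns in
    which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 65.0) -> bool:
    """True if identity is at least the threshold (65% is the lowest recited)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("MKTAYIAK", "MKTGYIAR"))
    print(meets_identity_threshold("MKTAYIAK", "MKTGYIAR", threshold=90.0))
```

A sequence pair passing such a threshold would still need to retain biological activity to fall within the "substantial equivalents" described above; the calculation addresses identity only.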

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
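
The notion of a degenerate variant can be checked mechanically: two coding sequences qualify if they differ in nucleotide sequence yet translate to the same polypeptide. The Python sketch below uses the standard genetic code and two made-up ORFs; the function names and sequences are hypothetical and serve only to illustrate the definition given above.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA ORF (reading frame 1) up to the first stop codon."""
    seq = orf.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    different = orf_a.upper() != orf_b.upper()
    return different and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    original = "ATGGCTAGCTTGTAA"  # made-up ORF
    variant = "ATGGCCTCATTATAA"   # synonymous codons substituted
    print(translate(original), translate(variant))
    print(is_degenerate_variant(original, variant))
```
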

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method, which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
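
The alanine-scanning procedure described above amounts to enumerating variants in which single residues, or short strings of residues, are replaced by alanine. The Python sketch below generates such a variant set for a toy sequence; the function name, the `window` parameter, and the example peptide are assumptions for illustration, and the biological-activity testing of each variant is outside the scope of the code.

```python
def alanine_scan(protein: str, window: int = 1) -> list:
    """List (1-based start position, variant sequence) pairs.

    Each variant replaces `window` consecutive residues with alanine.
    Windows that already consist solely of alanine are skipped because
    the substitution would leave the sequence unchanged.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    seq = protein.upper()
    variants = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = seq[:start] + "A" * window + seq[start + window:]
        variants.append((start + 1, variant))
    return variants


if __name__ == "__main__":
    for position, variant in alanine_scan("MKAL"):
        print(position, variant)
```

Setting `window` above 1 corresponds to substituting "strings of amino acids" rather than single residues; positions whose variants lose activity in the assay point to residues important for function.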

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 

The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
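
By way of illustration only, the following minimal Python sketch shows how a percent identity can be derived from a global alignment of the Needleman-Wunsch type, the approach on which GAP is generally described as being based. The scoring values (match +1, mismatch -1, linear gap penalty -2), the function names, and the use of the alignment length as the denominator are assumptions made for this example; the programs named above use substitution matrices, affine gap penalties and their own identity conventions, so their results may differ.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Scoring values are illustrative assumptions, not GCG or BLAST defaults.
    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)
```

For the hypothetical sequences "ACGT" and "AGT", this sketch aligns "ACGT" against "A-GT" and reports 75 percent identity (three identical positions over an alignment length of four).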

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
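
The requirement that the fusion moiety and the polypeptide be linked in-frame can be checked computationally before any construct is made. The following Python sketch is offered only as an illustration under stated assumptions: the function names, the optional linker argument, and the short example sequences are hypothetical and are not sequences of the invention. It joins two coding sequences, drops the stop codon of the N-terminal partner, verifies that each part is a whole number of codons, and translates the result with the standard genetic code to confirm that no internal stop codon interrupts the fusion.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a DNA coding sequence (A/C/G/T only); '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_part, c_part, linker=""):
    """Join two coding sequences so that c_part is read in the frame of n_part.

    Hypothetical helper for illustration: removes a terminal stop codon from
    the N-terminal partner, then checks frame and internal stop codons.
    """
    n_part, c_part, linker = n_part.upper(), c_part.upper(), linker.upper()
    if n_part[-3:] in STOP_CODONS:
        n_part = n_part[:-3]
    for name, seq in (("N-terminal part", n_part),
                      ("linker", linker),
                      ("C-terminal part", c_part)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} is not a whole number of codons")
    fusion = n_part + linker + c_part
    if "*" in translate(fusion)[:-1]:
        raise ValueError("fusion contains an internal stop codon")
    return fusion
```

For example, fusing the placeholder sequences "ATGGCCTAA" and "ATGAAATGA" yields "ATGGCCATGAAATGA", which translates to "MAMK*"; a partner whose length is not a multiple of three raises an error rather than silently shifting the reading frame.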

4.8 GENE THERAPY 

Mutations in a gene comprising the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific. 
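
As a simple illustration of the relationship between a sense strand and an antisense molecule directed to it, the following Python sketch computes the reverse complement of a short sequence. The function name and the example sequence are hypothetical; the sketch makes no claim about the length, chemistry or target region of an effective antisense molecule, all of which would have to be determined by methods known in the art.

```python
# Map each base to its Watson-Crick partner; RNA input (U) is read as T.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(sense):
    """Return the reverse complement of a sense-strand sequence as DNA.

    Hypothetical helper: the result, read 5' to 3', pairs with the input
    strand read 5' to 3' in antiparallel orientation.
    """
    return sense.translate(_COMPLEMENT)[::-1]
```

For the placeholder sense sequence "ATGGCC", the sketch returns "GGCCAT".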

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptide as well as for studying modulators of the
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific 
promoter driving a selectable marker. The selectable marker allows only cells of the desired type 
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be 
accomplished by culturing the stem cells in the presence of a differentiation factor such as 
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell 
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder 
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in 
the presence of the polypeptide of the invention alone or in combination with other growth 
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell 
proliferation is determined by colony formation on a semi-solid support, e.g., as described by 
Bernstein et al., Blood, 77: 2316-2321 (1991). 
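
By way of illustration only, colony formation data of the kind described above might be summarized as a ratio of colony counts in treated and control cultures. The following is a minimal sketch under stated assumptions: the function name, the replicate counts, and the use of a simple fold change of means are hypothetical and are not taken from the cited references. 

```python
from statistics import mean


def colony_fold_change(treated_counts, control_counts):
    """Return mean treated colony count divided by mean control colony count.

    Both arguments are lists of colony counts per dish (one entry per replicate).
    This summary statistic is an illustrative assumption, not a prescribed method.
    """
    if not treated_counts or not control_counts:
        raise ValueError("each condition needs at least one replicate")
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(treated_counts) / control_mean


# Hypothetical replicate counts (illustrative values only, not experimental data).
treated = [48, 52, 50]   # cultures supplemented with the polypeptide
control = [24, 26, 25]   # cultures without the polypeptide
print(f"fold change in colony number: {colony_fold_change(treated, control):.2f}")
```

A fold change above 1 in such a sketch would be consistent with, but would not by itself establish, stem cell growth factor activity; replicate variability would still need to be assessed. 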

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal 
biological activity in support of colony forming cells or of factor-dependent cell lines indicates 
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, 
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy 
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the 
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., 
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or 
treat consequent myelo-suppression; in supporting the growth and proliferation of 
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of 
various platelet disorders such as thrombocytopenia, and generally for use in place of or 
complementary to platelet transfusions; and/or in supporting the growth and proliferation of 
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned 
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as 
those usually treated with transplantation, including, without limitation, aplastic anemia and 
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment 
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow 
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) 
as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others, 
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., 
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells 
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic 
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et 
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, 
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, 
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of 
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture 
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. 
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma-induced, or oncologic resection-induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the 
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed has application 
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in 
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing 
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as 
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing 
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by 
a composition of the present invention contributes to the repair of congenital, trauma-induced, or 
other tendon or ligament defects, and is also useful in cosmetic plastic surgery for 
attachment or repair of tendons or ligaments. The compositions of the present invention may 
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or 
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect 
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, 
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral 
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which 
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 
composition may be used in the treatment of diseases of the peripheral nervous system, such as 
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous 
system diseases, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic 
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in 
accordance with the present invention include mechanical and traumatic disorders, such as spinal 
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 
desired effects may be achieved by inhibition or modulation of fibrotic scarring to allow normal tissue 
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book 
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 
71:382-84 (1978). 

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and 
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells 
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., 
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More 
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be 
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, e.g., in the treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, 
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host 
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof, 
including antibodies) of the present invention may also be useful in the treatment of allergic 
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect 
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, 
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, 
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal 
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma 
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune 
suppression is desired (including, for example, organ transplantation), may also be treatable 
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the 
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal 
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization 
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., 
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an 
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et 
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105 
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune 
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In 
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with 
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an 
MHC class I alpha chain protein and beta-2 microglobulin protein or an MHC class II alpha chain 
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II 
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction 
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T 
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding 
an antisense construct which blocks expression of an MHC class II associated protein, such as 
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity 
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce 
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. 
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. 
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., 
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. 
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery 
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et 
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of 
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., 
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., 
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991. 

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the 
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, 
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive 
based on the ability of inhibins to decrease fertility in female mammals and decrease 
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as 
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH 
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A 
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986. 



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population. 
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates 
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994. 
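
By way of illustration only, results from an assay that measures migration of cells across a membrane are often expressed as a ratio of cells migrating toward the test protein to cells migrating toward medium alone. The following is a minimal sketch under stated assumptions: the function names, the cell counts, and the cutoff of 2.0 are hypothetical and are not taken from the cited references. 

```python
def chemotactic_index(migrated_test, migrated_control):
    """Ratio of cells migrating toward the test protein to cells migrating
    toward medium alone (both are counts from the far side of the membrane)."""
    if migrated_control <= 0:
        raise ValueError("control migration must be positive")
    return migrated_test / migrated_control


def exceeds_cutoff(index, cutoff=2.0):
    """Flag an index at or above a cutoff. The default of 2.0 is an arbitrary
    illustrative assumption, not an accepted or prescribed threshold."""
    return index >= cutoff


# Hypothetical cell counts (illustrative values only, not experimental data).
index = chemotactic_index(migrated_test=150, migrated_control=50)
print(f"chemotactic index: {index:.1f}, exceeds cutoff: {exceeds_cutoff(index)}")
```

An elevated ratio in such a sketch does not by itself distinguish directed movement (chemotaxis) from increased random movement (chemokinesis); that distinction requires comparing conditions with and without a concentration gradient across the membrane. 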

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders (including 
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events 
in treating wounds resulting from trauma, surgery or other causes. A composition of the 
invention may also be useful for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of 
cardiac and central nervous system vessels (e.g., stroke)). 

25 Therapeutic compositions of the invention can be used in the following: 

Assay for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

30 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Culture Collection catalogs.



4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor, a
receptor ligand, or an inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses. Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
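
As an illustration only, the diminution in complex formation measured in a competitive
binding assay can be reduced to a percent-inhibition figure per test compound. The following
is a minimal Python sketch; the function names, the background-subtraction step and the 50%
hit threshold are assumptions introduced here for illustration and are not taken from the
disclosure.

```python
def percent_inhibition(signal_with_compound, signal_no_compound, background=0.0):
    """Percent reduction in complex formation caused by a test compound.

    All three arguments are raw assay readings in the same (arbitrary) units.
    The background reading is a hypothetical no-polypeptide control.
    """
    window = signal_no_compound - background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    return 100.0 * (1.0 - (signal_with_compound - background) / window)


def flag_hits(readings, signal_no_compound, background=0.0, threshold=50.0):
    """Return the names of compounds whose inhibition meets the threshold.

    readings maps a compound name to its raw assay reading. The default
    threshold of 50% is an assumed cut-off, not a value from the text.
    """
    return sorted(
        name
        for name, signal in readings.items()
        if percent_inhibition(signal, signal_no_compound, background) >= threshold
    )
```

For example, `flag_hits({"a": 550, "b": 900}, signal_no_compound=1000, background=100)`
keeps only compound "a", whose reading corresponds to a 50% reduction in complex formation.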

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in tissue culture or in vivo animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g. a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
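
The comparison of two otherwise identical cell populations described above can be sketched
as a simple fold-change calculation. The Python example below is illustrative only: the
function name, the dictionary layout of the readings and the two-fold cut-off are assumptions
made here and do not appear in the disclosure.

```python
def candidate_ligands(expressing, non_expressing, min_fold=2.0):
    """Rank ligands by how much more strongly receptor-expressing cells respond.

    expressing / non_expressing map a ligand name to a measured response in
    the receptor-expressing and receptor-negative populations, respectively.
    min_fold is an assumed cut-off for calling a ligand a candidate.
    """
    ranked = []
    for ligand, response in expressing.items():
        baseline = non_expressing.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        fold = response / baseline
        if fold >= min_fold:
            ranked.append((ligand, fold))
    # Strongest differential response first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

A ligand that produces the same response in both populations gives a fold change near 1 and
is dropped, which mirrors the reasoning that only a receptor-dependent response is informative.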

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g. phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.



4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
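
The restriction fragment length polymorphism analysis mentioned above can be illustrated
with a short in silico digest. In the Python sketch below, the two short sequences are
invented for illustration and do not correspond to any sequence of the invention; EcoRI
(recognition site GAATTC, cutting after the first G) is chosen only as a familiar example
enzyme, and the function name is hypothetical.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    site is the recognition sequence and cut_offset is the position of the
    cut within the site (1 for EcoRI, which cuts G^AATTC on the top strand).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)  # remainder after the last cut
    return fragments


# Invented example alleles: a single A-to-G change destroys the EcoRI site.
reference_allele = "AAAAGAATTCTTTT"
variant_allele = "AAAAGAGTTCTTTT"

reference_fragments = digest(reference_allele, "GAATTC", 1)
variant_fragments = digest(variant_allele, "GAATTC", 1)
```

Here the reference allele is cut into two fragments while the variant allele remains a single
fragment, which is the kind of differential digestion pattern the analysis relies on.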

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
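
The dosing and scoring schedule of this protocol can be laid out in a few lines of Python.
The sketch below is illustrative only: it assumes that the day of adjuvant injection is
counted as day 0, that scores are recorded per rat on each scoring day, and that group means
are compared by simple subtraction; the function names and data layout are hypothetical.

```python
from statistics import mean

# Scoring days stated in the protocol (days after adjuvant injection).
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def dosing_days(last_day=24, interval=2):
    """Days on which the test compound is given: day 0, then every other day."""
    return list(range(0, last_day + 1, interval))


def mean_scores(scores_by_day):
    """Map each scoring day to the mean arthritis score of one group.

    scores_by_day maps a day to a list of per-rat scores for that day.
    """
    return {day: mean(scores_by_day[day]) for day in SCORING_DAYS}


def score_reduction(treated, control):
    """Per-day difference (control mean minus treated mean) in arthritis score."""
    treated_means = mean_scores(treated)
    control_means = mean_scores(control)
    return {day: control_means[day] - treated_means[day] for day in SCORING_DAYS}
```

A positive value from `score_reduction` on a given day indicates a lower mean arthritis score
in the treated group than in the PBS-only control group on that day.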


4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
35 disorder that can be modulated by regulating the peptides of the invention. While the mode of 
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administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
5 condition and response of the individual patient. Typically, the amount of polypeptide 

administered per dose will be in the range of about 0.01|ig/kg to 100 mg/kg of body weight, with 
the preferred dose being about O.ljag/kg to 10 mg/kg of patient body weight For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutical^ acceptable parenteral vehicle. Such vehicles are well known in the art 
10 and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of the human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
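
The per-dose ranges quoted above scale linearly with body weight. The following is a minimal sketch of that arithmetic only, not clinical guidance; the function name `dose_bounds_ug` and the 70 kg example weight are assumptions introduced for illustration and do not come from the specification.

```python
# Illustrative arithmetic only; names and the example weight are hypothetical.
UG_PER_MG = 1000.0

# Per-dose ranges quoted in the text, expressed in micrograms (ug) per kg.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)     # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)   # about 0.1 ug/kg to 10 mg/kg


def dose_bounds_ug(weight_kg, range_ug_per_kg):
    """Return the (low, high) absolute dose in micrograms for a body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low_per_kg, high_per_kg = range_ug_per_kg
    return (low_per_kg * weight_kg, high_per_kg * weight_kg)


# Example: bounds of the preferred range for an assumed 70 kg patient.
example_low_ug, example_high_ug = dose_bounds_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
```

The actual dose within any such range remains, as stated above, a matter for the prescribing physician.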

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti- 
inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 
As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or antithrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or antithrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician, to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active 
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the 
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee 
cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
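
The VPD and VPD:5W compositions described above can be checked with simple dilution arithmetic. The sketch below assumes that "diluted 1:1" means equal volumes and that volumes are additive on mixing; both are simplifying assumptions rather than statements from the specification, and the helper names are hypothetical.

```python
# Concentrations in % w/v (grams per 100 mL), taken from the text.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_PERCENT_WV = {"dextrose": 5.0}  # 5% dextrose in water


def mix_equal_volumes(a, b):
    """% w/v of each component after mixing equal volumes of a and b.

    Assumes additive volumes, so every concentration is halved.
    """
    names = set(a) | set(b)
    return {name: (a.get(name, 0.0) + b.get(name, 0.0)) / 2.0 for name in names}


def grams_needed(percent_wv, volume_ml):
    """Mass in grams of each component needed for volume_ml of solution."""
    return {name: pct * volume_ml / 100.0 for name, pct in percent_wv.items()}


vpd_5w = mix_equal_volumes(VPD_PERCENT_WV, D5W_PERCENT_WV)
```

Under these assumptions, VPD:5W would contain 1.5% w/v benzyl alcohol, 4% w/v polysorbate 80, 32.5% w/v polyethylene glycol 300 and 2.5% w/v dextrose, with the remainder being ethanol and water.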

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 
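
Because the sequestering agent content above is stated as a percentage of total formulation weight, the agent's own mass counts toward the total. The sketch below illustrates that arithmetic; the function names and the example masses are hypothetical and are not taken from the specification.

```python
# Ranges quoted in the text, in wt % of total formulation weight.
USEFUL_WT_PERCENT = (0.5, 20.0)
PREFERRED_WT_PERCENT = (1.0, 10.0)


def sequestering_agent_mass(other_mass_g, target_wt_percent):
    """Mass of agent (g) to add to other_mass_g of all other components so
    that the agent makes up target_wt_percent of the total formulation."""
    if other_mass_g <= 0:
        raise ValueError("other_mass_g must be positive")
    if not 0.0 < target_wt_percent < 100.0:
        raise ValueError("target_wt_percent must be between 0 and 100")
    return other_mass_g * target_wt_percent / (100.0 - target_wt_percent)


def within(wt_percent, bounds):
    """True if wt_percent lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= wt_percent <= high


# Example: 90 g of matrix plus protein, targeting 10 wt % agent.
example_agent_g = sequestering_agent_mass(90.0, 10.0)
```

Dividing by (100 minus the target) rather than by 100 is the design point here: adding 10 g of agent to 90 g of other components gives 10 g in 100 g total, that is, 10 wt %.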

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF-I (insulin-like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that can be used to more accurately determine useful doses in 
humans. For example, a dose can be formulated in animal models to achieve a circulating 
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of 
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosages for use in humans. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 

1 5 bioassays can be used to determine plasma concentrations. 
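
By way of illustration only, the therapeutic index described above (the ratio between LD50 and ED50) may be computed as in the following Python sketch. The function name and the numerical values are hypothetical and are not taken from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    Both doses must be positive and expressed in the same units
    (for example mg/kg); the result is dimensionless.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values, for illustration only.
example_index = therapeutic_index(ld50=500.0, ed50=25.0)
print(example_index)
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose.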

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
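
The fraction of a dosing interval spent above the MEC may be estimated, for example, as in the following Python sketch. The sketch assumes a simple one-compartment model with first-order elimination, in which the plasma level starts at a peak value at the beginning of each interval and decays exponentially. That model, the function name and all numerical values are assumptions made for illustration and are not part of this disclosure.

```python
import math


def fraction_above_mec(c_peak: float, mec: float,
                       half_life_h: float, interval_h: float) -> float:
    """Fraction of a dosing interval during which the plasma level exceeds the MEC.

    Assumes the concentration equals c_peak at the start of the interval and
    decays as c_peak * exp(-k * t), with k = ln(2) / half_life_h.
    """
    if min(c_peak, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c_peak <= mec:
        return 0.0
    k = math.log(2) / half_life_h              # first-order elimination rate constant
    time_above = math.log(c_peak / mec) / k    # time at which the level falls to the MEC
    return min(time_above, interval_h) / interval_h


# Hypothetical regimen: peak 8 units, MEC 2 units, 4 h half-life, dosing every 12 h.
coverage = fraction_above_mec(c_peak=8.0, mec=2.0, half_life_h=4.0, interval_h=12.0)
print(f"{coverage:.0%} of the interval is above the MEC")
print("within the 50-90% window:", 0.5 <= coverage <= 0.9)
```

In practice, measured plasma concentrations (e.g., from HPLC assays or bioassays) would replace the modeled decay curve.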

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred 
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
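
As a simple arithmetic illustration, the per-kilogram ranges stated above can be converted into a daily dose range for a given body weight, as in the following Python sketch. The function and constant names, and the 70 kg example weight, are hypothetical. The sketch performs unit conversion only and is not a dosing recommendation.

```python
# Bounds from the text, expressed in micrograms per kg of body weight per day.
EXEMPLARY_UG_PER_KG = (0.01, 100_000.0)   # about 0.01 ug/kg to 100 mg/kg
PREFERRED_UG_PER_KG = (0.1, 25_000.0)     # about 0.1 ug/kg to 25 mg/kg


def daily_dose_range_ug(weight_kg, bounds_ug_per_kg):
    """Return (low, high) daily dose in micrograms for the given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = bounds_ug_per_kg
    return low * weight_kg, high * weight_kg


# Hypothetical 70 kg adult.
low_ug, high_ug = daily_dose_range_ug(70.0, PREFERRED_UG_PER_KG)
print(f"preferred range: {low_ug:.1f} ug to {high_ug / 1000:.0f} mg per day")
```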

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 



4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and 
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte- 
Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. 
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
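
A hydropathy profile of the kind referred to above may be sketched, for example, as follows in Python. The sketch uses the Kyte-Doolittle hydropathy scale with a sliding-window average and no Fourier transformation. The window length of 7, the function names and the example sequence are assumptions chosen for illustration; the sequence shown is hypothetical and is not a sequence of the invention.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy index (Kyte and Doolittle, 1982); positive = hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> List[float]:
    """Mean Kyte-Doolittle score of every full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence: str, window: int = 7) -> Tuple[int, str, float]:
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    seq = sequence.upper()
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


# Hypothetical sequence, for illustration only.
start, peptide, score = most_hydrophilic_window("MKTLLVAGDEKRNSQLIVAF", window=7)
print(start, peptide, round(score, 2))
```

Windows with the lowest mean score are the most hydrophilic and are therefore candidate surface regions. Whether such a region is actually exposed, or antigenic, would still need to be confirmed experimentally.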

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
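
For illustration, a classical linear Scatchard estimate of binding affinity may be sketched in Python as follows. The sketch assumes a single class of binding sites, for which bound/free = (Bmax - bound)/Kd, so that a straight line fitted to bound/free versus bound has slope -1/Kd and intercept Bmax/Kd. This is the textbook linear form and is not the curve-fitting program of Munson and Pollard; the function name and the data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (kd, bmax).

    Assumes one class of independent binding sites, with bound and free
    ligand concentrations given in the same units.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data do not fit a single-site model")
    kd = -1.0 / slope          # dissociation constant; the affinity Ka is 1 / kd
    bmax = intercept * kd      # x-intercept of the Scatchard line
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2 and Bmax = 10 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd, bmax = scatchard_fit(bound_conc, free_conc)
print(round(kd, 3), round(bmax, 3))
```

With real, noisy binding data the linear transform distorts the error structure, so a direct nonlinear fit is often preferred; the linear form is shown here only because it makes the relationship between slope and affinity explicit.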

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other 
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
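
The figure of ten different antibody molecules can be checked by simple enumeration, as in the following Python sketch. It assumes that each antibody consists of two heavy-chain/light-chain arms, that either heavy chain can pair with either light chain, and that the two arms of a molecule are interchangeable. The chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")   # heavy chains of the two parental antibodies
light_chains = ("L1", "L2")   # light chains of the two parental antibodies

# Each arm pairs one heavy chain with one light chain: 4 possible arms.
arms = [h + l for h, l in product(heavy_chains, light_chains)]

# An antibody is an unordered pair of arms, and the two arms may be identical.
antibodies = list(combinations_with_replacement(arms, 2))

# Only the molecule carrying both cognate heavy/light pairs is the intended bispecific.
correct = [ab for ab in antibodies if set(ab) == {"H1L1", "H2L2"}]

print(len(antibodies))  # 10
print(correct)          # [('H1L1', 'H2L2')]
```

Four possible arms taken two at a time with repetition give 4 x 5 / 2 = 10 combinations, of which a single one has the desired bispecific structure.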

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and 
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and 
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced 
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include 



protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithio) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
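By way of illustration only, the two storage routes named above (an ASCII text file and a database application) might be sketched as follows. The FASTA layout, the use of SQLite in place of DB2, Sybase or Oracle, the table and column names, and the example identifier and sequence are all assumptions made for this sketch and are not taken from the sequences of the invention.

```python
import sqlite3

def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as an ASCII FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{seq_id}\n" + "\n".join(lines) + "\n"

def store_in_database(conn: sqlite3.Connection, seq_id: str, sequence: str) -> None:
    """Record a sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO sequences VALUES (?, ?)", (seq_id, sequence))
    conn.commit()

def fetch_from_database(conn: sqlite3.Connection, seq_id: str):
    """Read a stored sequence back, or return None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None

if __name__ == "__main__":
    # Placeholder record; not one of the SEQ ID NO entries.
    example_id = "example_seq_1"
    example_seq = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    print(to_fasta(example_id, example_seq), end="")
    connection = sqlite3.connect(":memory:")
    store_in_database(connection, example_id, example_seq)
    print(fetch_from_database(connection, example_id))
```

Either form leaves the residues recoverable character for character, which is the only property the passage above relies on.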

By providing any of the nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358, in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
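As a rough illustration of what identifying an ORF in a stored sequence involves, the sketch below scans the three forward reading frames for an ATG followed in frame by a stop codon. It is a deliberately simple stand-in and is not the BLAST or BLAZE procedure cited above; the function name, the `min_codons` threshold and the forward-strand-only scan are assumptions of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence: str, min_codons: int = 10):
    """Return (start, end, frame) for each forward-strand ORF.

    start and end are 0-based and end-exclusive; the span runs from the
    ATG through the stop codon. min_codons counts codons before the stop.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # resume the search after this stop codon
    return orfs

if __name__ == "__main__":
    # Toy sequence: ATG, four GCT codons, then TAA, offset by two bases.
    toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
    print(find_orfs(toy, min_codons=5))
```

A fuller treatment would also scan the reverse complement and handle ORFs that run off the end of a fragment.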

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
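The remark that a longer target sequence is less likely to occur at random can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that residues are independent and equally frequent (4 possible nucleotides, 20 possible amino acids); real databases depart from this, so the numbers are only indicative. The function name and the database size used are hypothetical.

```python
def expected_random_hits(target_length: int, alphabet_size: int,
                         database_length: int) -> float:
    """Expected number of exact chance matches of one target in a database.

    Assumes uniform, independent residues: each window of the database
    matches the target with probability alphabet_size ** -target_length.
    """
    windows = max(database_length - target_length + 1, 0)
    return windows * float(alphabet_size) ** -target_length

if __name__ == "__main__":
    db_length = 10 ** 9  # hypothetical database size in residues
    for length in (6, 12, 30):
        hits = expected_random_hits(length, 4, db_length)
        print(f"{length}-nucleotide target: {hits:.3g} expected chance matches")
```

Under these assumptions each added nucleotide divides the expected number of chance matches by four, which is why a six-nucleotide target is matched by chance throughout a large database while a thirty-nucleotide target essentially never is.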

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
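As one possible illustration of searching stored sequence for a structural motif, the sketch below looks for candidate hairpins, taken here to mean a short stem followed, after a loop of a few bases, by the stem's reverse complement. The stem and loop lengths, the exact-match requirement and the function names are assumptions of this sketch; practical motif searches usually tolerate mismatches and score thermodynamic stability.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_hairpins(sequence: str, stem: int = 4,
                  min_loop: int = 3, max_loop: int = 8):
    """Return (stem_start, loop_length) for each perfect inverted repeat."""
    seq = sequence.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        wanted = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the second stem arm would begin
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == wanted:
                hits.append((i, loop))
    return hits

if __name__ == "__main__":
    # GCGA ... TCGC can pair as a 4-base stem around a 3-base loop.
    print(find_hairpins("AAGCGATTTTCGCAA", stem=4, min_loop=3, max_loop=5))
```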

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
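A minimal sketch of the sequence-design step for an antisense oligonucleotide follows: it takes a window of 20 to 40 bases from an mRNA and returns the complementary DNA strand written 5' to 3'. The function name and the choice of a DNA (rather than RNA) oligonucleotide are assumptions of the sketch; selecting which region of a transcript to target, and designing triple helix forming oligonucleotides, involve further considerations that are not modelled here.

```python
# Maps each mRNA (or DNA) base to its DNA complement.
COMPLEMENT = str.maketrans("ACGUT", "TGCAA")

def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo complementary to mrna[start:start+length], 5'->3'."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]

if __name__ == "__main__":
    toy_mrna = "A" * 10 + "C" * 10  # toy transcript, not a real sequence
    print(antisense_oligo(toy_mrna, 0, 20))
```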

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting the presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
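The notion of a degenerate pool can be illustrated by expanding a probe written with IUPAC ambiguity codes into every discrete sequence it stands for. The sketch below is only an illustration of that bookkeeping; the function name and the example probe are hypothetical, and the IUPAC table is the standard nucleotide ambiguity code.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(probe: str):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]

if __name__ == "__main__":
    pool = expand_degenerate("ATGRAYTGG")  # hypothetical degenerate probe
    print(len(pool), pool)
```

The pool size is the product of the number of alternatives at each position, so a few highly ambiguous positions quickly dominate the complexity of the pool.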
Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
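The quantities in the linkage method above can be checked with simple dilution arithmetic. The sketch below makes two assumptions that the text does not state: that 7.5 ng/ul is the DNA concentration before the 1-methylimidazole buffer is added, and that volumes are additive. The function and key names are hypothetical.

```python
def linkage_mix(dna_ng_per_ul=7.5, meim_stock_m=0.1, meim_final_m=0.010,
                dna_ul_per_well=75.0, edc_stock_m=0.2, edc_ul_per_well=25.0):
    """Work through the dilution arithmetic for one CovaLink NH well."""
    # C1*V1 = C2*V2: fraction of the DNA/buffer mix that is stock buffer.
    meim_fraction = meim_final_m / meim_stock_m
    # Assumption: 7.5 ng/ul is the DNA concentration before buffer is added.
    dna_after_buffer = dna_ng_per_ul * (1.0 - meim_fraction)
    dna_ng_in_well = dna_after_buffer * dna_ul_per_well
    # The EDC stock is diluted by the DNA solution already in the well.
    total_ul = dna_ul_per_well + edc_ul_per_well
    edc_final_m = edc_stock_m * edc_ul_per_well / total_ul
    return {
        "meim_stock_fraction": meim_fraction,
        "dna_ng_per_ul_after_buffer": dna_after_buffer,
        "dna_ng_per_well": dna_ng_in_well,
        "total_ul_per_well": total_ul,
        "edc_final_molar": edc_final_m,
    }

if __name__ == "__main__":
    for name, value in linkage_mix().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions one part of 0.1 M buffer is mixed with nine parts of DNA solution, each well receives roughly 0.5 ug of DNA, and the EDC is present at about 50 mM during the incubation.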

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6 (incorporated
herein by reference). These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
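
The site specificities described above may be illustrated with a short sketch. The following Python example is a minimal illustration only and is not taken from Fitzgerald et al: it scans a DNA string for PuGCPy sites (normal CviJI specificity) or for PuGCPy, PyGCPy and PuGCPu sites (relaxed CviJI** specificity) and cuts between the G and the C. The function names `cut_positions` and `fragments` are hypothetical, and treating the substrate as a linear, completely digested molecule is a simplifying assumption (pUC19 is circular).

```python
import re

# Lookahead patterns, so that overlapping recognition sites are all reported.
# Pu = A or G, Py = C or T.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")  # PuGCPy
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # PuGCPy, PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Return 0-based indices of the first base after each blunt cut (between G and C)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    # A site starts at m.start(); G is at +1 and C at +2, so the cut precedes index +2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut position (assumes complete digestion)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "TTAGCTAACGCCAATGCATT"
print(fragments(example))                # normal specificity
print(fragments(example, relaxed=True))  # relaxed specificity
```

In this toy sequence the relaxed specificity adds one cut (at a PyGCPy site) to the single PuGCPy cut, while the PyGCPu site is left uncut under both settings.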

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
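
The offset printing geometry of the 64-patient example may be sketched as follows. This Python example is illustrative only. It assumes a 9 mm pin pitch (the usual spacing of a 96-well plate, which is not stated in the text) and an 8 x 8 arrangement of the 64 patient dots inside each subarray; the function name `dot_position` and the constants are hypothetical. Under these assumptions, eight dots of 1 mm plus a 1 mm gap fill exactly one 9 mm pin pitch.

```python
PIN_PITCH_MM = 9.0  # assumption: standard 96-well spacing, not given in the text
DOT_PITCH_MM = 1.0  # dot span of 1 mm, as in the example above
SIDE = 8            # assumption: the 64 patient dots form an 8 x 8 block per subarray
PIN_ROWS, PIN_COLS = 8, 12  # 96-pin device


def dot_position(patient, pin_row, pin_col):
    """Return (x, y) in mm of the dot left by one pin when spotting one patient plate.

    Each pin defines one subarray; each patient plate is printed with a small
    offset so that its dot lands in its own cell of every subarray.
    """
    if not 0 <= patient < SIDE * SIDE:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin outside the 8 x 12 device")
    dx, dy = patient % SIDE, patient // SIDE  # offset of this plate within a subarray
    x = pin_col * PIN_PITCH_MM + dx * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + dy * DOT_PITCH_MM
    return (x, y)


# The last patient printed by the last pin gives the largest coordinates used.
print(dot_position(63, 7, 11))
```

With these assumed dimensions the furthest dot starts at (106, 70) mm, so the whole pattern fits on the 8 x 12 cm membrane.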

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES

5.1 EXAMPLE 1

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
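
The idea of grouping clones by their probe signatures may be sketched as follows. This Python example is a simplified illustration and not the procedure actually used: it assumes idealized hybridization (a probe scores positive only when it matches either strand of the clone perfectly), a Jaccard similarity with an arbitrary threshold of 0.8, and a greedy single-pass clustering. All function names, the probe set and the clone sequences are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def signature(clone_seq, probes):
    """Probes with a perfect match on either strand of the clone (idealized hybridization)."""
    strands = (clone_seq, reverse_complement(clone_seq))
    return frozenset(p for p in probes if any(p in s for s in strands))


def jaccard(a, b):
    """Similarity of two signatures: shared probes divided by all probes seen."""
    return 1.0 if not (a or b) else len(a & b) / len(a | b)


def cluster_clones(clones, probes, threshold=0.8):
    """Greedy clustering: a clone joins the first cluster whose founder is similar enough."""
    clusters = []  # list of (founder signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for founder_sig, members in clusters:
            if jaccard(founder_sig, sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]


# Made-up probes and clones, for illustration only.
probes = ["ACGTACG", "TTTTTTT", "GGGCCCA"]
clones = {"a": "AAACGTACGAA", "b": "CCACGTACGCC", "c": "TTTTTTTTT"}
print(cluster_clones(clones, probes))  # clones a and b share a signature; c stands alone
```

One representative per cluster (for example the first member) could then be selected for sequencing.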

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.



WO 01/53312 PCT/US00/34263 
5.1.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
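
The seed extension loop described above may be sketched as follows. This Python example is a minimal illustration under stated assumptions and not the program actually used: the `search` callable is a hypothetical stand-in for a BLASTN search of the databases against the current assemblage, the `Hit` record and all identifiers are invented for illustration, and the recursion is written as an equivalent loop. Only the two thresholds (score greater than 300, identity greater than 95%) and the stopping rule are taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set


@dataclass(frozen=True)
class Hit:
    seq_id: str              # identifier of the database sequence
    score: float             # BLAST score of the hit
    percent_identity: float  # percent identity of the hit


def extend_assemblage(seed_id: str,
                      search: Callable[[Set[str]], Iterable[Hit]],
                      min_score: float = 300.0,
                      min_identity: float = 95.0) -> Set[str]:
    """Grow an assemblage from a seed until no further qualifying sequence is found."""
    members = {seed_id}
    while True:
        # Keep only hits that pass both thresholds and are not yet in the assemblage.
        new = {h.seq_id for h in search(members)
               if h.score > min_score
               and h.percent_identity > min_identity
               and h.seq_id not in members}
        if not new:  # nothing extends the assemblage: stop
            return members
        members |= new


# Toy data standing in for database search results (entirely made up).
TOY_HITS = {
    "seed": [Hit("est1", 450, 98.0), Hit("est2", 280, 99.0)],
    "est1": [Hit("est3", 310, 96.5), Hit("est4", 500, 94.0)],
}


def toy_search(members):
    """Hypothetical search: return the stored hits of every current member."""
    return [hit for m in members for hit in TOY_HITS.get(m, [])]


print(sorted(extend_assemblage("seed", toy_search)))
```

In the toy data, est2 is rejected on score and est4 on percent identity, so the assemblage stops growing once est1 and est3 have been added to the seed.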

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327.

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences encoded by the nucleic acid sequences are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 4 

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences encoded by the nucleic acid sequences are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.
The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences encoded by the nucleic acid sequences are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homologues for SEQ ID NO: 1653-1745 were obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 118, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homologues for SEQ ID NO: 1746-1768 were obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 119, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences encoded by the nucleic acid
sequences are shown in the Sequence Listing. The full-length nucleotide sequences, including splice
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS:
1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homologues for SEQ ID NO: 1769-1786 were obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin    RNA Source    Hyseq Library Name    SEQ ID NOS:

adult brain    GIBCO    AB3001

9 19-21 50-51 65-66 72 78 80 82
85 87 107-108 113 116 123 138
140 150-152 159 169 177 192-193
202-203 212-214 225-226 235-236
251 258 268-269 272 280-281 295
298 301 321 326 331-332 334 356-357
362 369 379 382-383 416 423
443 459-460 473 475 477 488 496
500 503 519 526 547 574 582 587
608-609 613 618 633-634 645-646
652 657-658 660 669-671 678 687
695 697 710 715 724 731 775-777
796 804 811 857-859 862 869 899-900
912 919 922 924-929 933 936
962 979 988-989 996 1001 1004-1008
1018 1039 1047 1059 1064
1067 1070 1078 1082 1107 1113
1116-1117 1131 1134-1137 1140
1149 1151 1157 1180 1205 1229
1234 1241 1243 1258 1272-1273
1279 1288-1290 1294 1307-1308
1312 1320 1323 1330 1356 1360-1361
1368 1373-1375 1379 1391
1400 1417 1446 1468 1482 1493-1494
1501-1503 1506-1507 1512
1517 1522-1524 1530-1533 1537
1549 1565 1578 1598 1606 1608
1623 1625 1627 1639 1643 1648-1649
1653 1664 1667 1671 1696
1734 1741 1743-1744 1760-1761
1771

adult brain    GIBCO    ABD003

3 12-14 18-19 25 30-31 34-36 43-45
50-51 56 58 60 65-66 68-69 80
82 85 87 92 104 107-108 112-113
115-116 123-124 131-132 135-137
139 142 146 148-149 152 154 157
159 163 165 167 169 172 180 192-193
196-197 199 203 208 210 212-214
223 233 235-237 247 257 259
261 268-269 272 276 280-281 284-288
291-292 295 297 300-301 304
307 317 320-321 323 327 329-331
333-334 345-349 356-357 379-381
393 401 408 414 419 424 426-428
430 433-436 438-439 443 445 449
453-454 459-461 468 471-473 476-478
483 491 494 496 500 503 507-508
516 519-520 525-527 534 536-540
542-543 545 553 555 560 569-570
574-576 586-588 593 595 597
601 606-609 616-620 622-623 625
628-633 635-636 643 645-649 653
655-656 660-665 668-670 676 681
687 701 710 715 717 724-728 735
743 745-746 750 753 759 765-766
773 775-778 786 789 796 799-800
802-803 810-811 815 817 820-821
832 834-836 840 845-847 851 858-861
864 869 874 878 883 897 901-902
904-905 908 911-914 916 921-922
924-927 929 932-934 936-939
941-942 945 955-958 963 966-969
977 979-980 985-986 990 992-993
997-1001 1005-1007 1012 1017-1020
1023-1024 1029-1031 1034
1036 1039 1050 1059 1063-1066
1078 1081-1082 1085-1086 1089
1097 1103 1107 1109 1112 1116-1117
1119 1121 1124 1127 1130 1134
1144-1145 1149 1151 1157-1158 1167
1170 1178 1184 1188 1190 1193-1194
1200 1202 1215-1217 1220 1226-1227
1229 1231 1241 1243 1247 1252 1258
1263 1267 1269 1279 1281 1284
1286-1289 1293-1294 1306-1307 1312
1316-1320 1326 1333 1338 1341 1344
1348 1351 1355-1357 1368 1374 1377
1380 1386 1389-1390 1394 1400 1409
1414 1422-1423 1425-1427 1437 1443
1446 1454 1456 1458-1459 1468
1470-1472 1478 1482-1483 1487-1488
1493 1497 1499 1506 1508-1511 1517
1522-1524 1530-1533 1545-1546
1548-1550 1552 1557-1559 1563 1565
1567 1569 1571 1586 1588 1591 1593
1595 1598-1601 1608 1611 1620-1621
1624-1626 1628 1630-1632 1636
1640-1641 1644-1645 1647 1649
1653-1655 1657 1664 1667 1669 1673
1678-1681 1686 1690 1694-1696 1701
1709 1711 1719 1722-1723 1726-1727
1731-1733 1738 1740 1743-1744 1747
1749 1753 1757-1758 1760-1761 1765
1771 1785

adult brain 



Clontech 



ABR001 



9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 803 833 865 871 875 880 
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



adult brain 



Clontech 



ABR006 



5-8 15-16 168 212-213 271 278 
280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1262 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1665 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1761 



adult brain 



Clontech 



ABR008 



5-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 








208 210 214-215 218 221-226 229 231-232
234-241 245-247 251-253 255 257-259
268-269 271 276-281 285-286 288 290-292
300-302 304 307 309-311 313 315 317-318
320-322 325-326 328 330-331 333-338 341
344-347 349 352 354 356-357 362 369-373
376 379-380 382 384 387 390-391 393-394
397 399-403 405-411 414-415 417-420
426-428 437-438 440-444 453-455 462 464
467 469-471 476 478 482-484 488-491 497
503 506-513 516-517 520 524-526 528-530
532-534 537-540 542 544 547-551 553 561
565-567 572-574 577 581 585 587-588
590-591 597 599 601-602 606-610 612
615-617 619-620 622-623 628-629 631
633-634 636-641 643 645-647 651-653
655-664 669-671 673 679 682 687 689
691-700 702 706 710 715-717 720-721
725-734 736-739 742-743 746 750-752 756
758-759 762-764 766 768 773-778 780-782
784-785 787-789 794 796 799 802-803 805
811 814-815 818 825-826 834-837 839-840
842-843 856-859 861-862 865 867-872
874-875 881 883-884 887 889-892 894-895
897-898 901 904 908 910 912 914 917 919
921-924 926-927 930-932 935-941 943 945
949 953-954 958 961-963 967 969 971 975
977 981-983 986 988-990 992 997 999-1002
1004-1006 1008 1012 1018-1023 1027
1029-1031 1035-1037 1047-1048 1053 1057
1059 1063 1068 1070 1072-1075 1077
1081-1083 1085-1093 1095-1096 1108-1112
1114-1125 1127 1131-1133 1135-1138
1142-1145 1148-1158 1160-1163 1167 1169
1172 1175 1177 1180 1183-1188 1191-1195
1199-1200 1204 1206 1211 1213-1216
1222-1223 1226-1227 1229-1231 1234-1235
1241-1242 1244-1263 1266 1269-1271
1275-1277 1279-1281 1284-1286 1292
1294-1295 1299 1305-1309 1312 1314
1316-1319 1322 1324-1327 1330 1332
1334-1335 1339 1344-1346 1351 1354-1355
1357-1358 1365-1367 1369-1370 1373-1374
1376-1379 1381-1384 1386-1388 1392 1394
1396-1397 1400 1403-1407 1410 1414
1419-1420 1423 1432-1433 1435 1437-1438
1440-1442 1446 1448 1453-1455 1457 1461
1463-1464 1466 1468 1471 1477 1480
1482-1483 1496 1502-1504 1507-1509 1513
1519-1520 1524-1526 1536 1547 1549-1552
1567 1573-1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611-1617
1619-1621 1623 1625-1626 1635-1641
1643-1645 1649 1651 1653 1656-1658 1664
1669 1671-1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786



adult brain 



Clontech 



ABR011 



adult brain 



BioChain 



ABR012 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



adult brain 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



Invitrogen 



ABR013 



adult brain 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



Invitrogen 



ABR014 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



adult brain 



Invitrogen 



ABR015 



adult brain 
adult brain 



419 434-435 441-442 763 789 983 
1320 



Invitrogen 



ABR016 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



Invitrogen 



ABT004 



cultured 
preadipocytes 



Stratagene



ADP001 



14-16 22-23 25 37-39 43 58 60 
70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 198 210 218 229 259 267 
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 789 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960- 
961 963 968-969 988-989 1000 
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262 
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547 
1554 1557-1559 1561-1562 1567 
1585 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



5-8 11 17 25 68-69 80 82 87 103 
105 110 116 136-138 168 171 188- 
189 196-198 261 267 276 288 293 
301 318 331 336-338 379-380 391 
400 428 430-431 510-512 520 524 
527 549 557 561 602 618 620 622
631 637 647 670 681-682 710 731 
748 782 793-794 817 834-836 843 
845 858-859 879 882 893-895 934 
960 982 986 995-996 1000 1002 
1005-1007 1025 1027-1028 1032 
1039 1045 1071 1078 1097 1099- 
1102 1136-1137 1140 1219-1220 



1260 1271 1297-1298 1314 1320 1322 1329
1339 1345 1365-1366 1370-1371 1398 1408
1423 1431 1437 1466 1468 1533 1539 1594
1602 1608 1614 1631 1649-1650 1660 1662
1673 1687-1688 1696 1711 1719-1720 1742
1746 1749 1760-1761 1765 1767 1771 1785



adrenal gland



Clontech 



ADR002 



4-10 15-16 25 29-31 43-45 47 50- 
51 55 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169-
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280- 
281 285 295 298 311 336-338 342 
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175 
1181 1188 1209 1218 1224-1225
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1545 1567 1573-1575 1588 
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765 



adult heart 



GIBCO



AHR001 



4-8 10-11 15-16 18-21 34-39 44-46 50-52
57-58 60 62-63 71 75 82 85 87 89 94 97
100 103-104 108-110 112 114 116 118-119
122-123 127 130-132 134 136-138 141-144
147-151 153 163-164 168-171 179 186 192
195 197 199 204-205 212-215 220 225-226
229-230 232 234-236 251 257-260 262 265
272 274 277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324-325 330
333 336-338 345 349 351-352 354 358 361
368 370 380 383-384 387-388 391 393 397
401 406 408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455 457 459
462 469 472-473 476-480 483-484 487-490
492-493 496-498 503 506 508 510 513 516
519-522 526 534 536-540 542 546 549 553
560-562 574-577 581-582 584 586-587 589
593 595 597 604-609 611-612 615-620
622-623 626 632 637 645-652 656-660
665-666 670-672 674-675 683-684 687
692-694 697 701 709 712 715-716 719-720
725-




















726 728 730-732 735 738-739 743-744 746
751 753 759 761 765 770-771 775-780 785
788-790 796 802 804 810 812 817 821 826
828 830 837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881 883 887
890-892 894-895 897-898 901 903 906-907
911-913 915 919 921-925 927-928 933-935
945 958








y bi - 


963 967 969-972 975 977-978 980-986 990
992 999-1002 1005-1007 1010 1016
1019-1020 1022-1023 1025 1028-1037
1039-1040 1043 1047 1050 1054-1055 1057
1059 1063-1064 1067-1068 1070 1072
1075-1076 1083 1085-1087 1089 1093-1094
1104 1106 1108-1109 1113 1116-1117 1119
1121 1124 1126 1128 1131-1134 1144-1145
1148-1149 1151 1158 1167 1169-1170 1175
1177 1192 1196 1199-1200 1202 1206-1208
1211 1216 1218 1222 1227-1229 1232-1235
1238-1241 1243-1244 1247-1248 1250
1253-1254 1256-1258 1261 1268 1270-1271
1277 1280-1282 1287 1292 1298-1299 1306
1308 1317-1321 1324-1325 1330 1332
1334-1337 1339 1344-1345 1349-1350
1354-1356 1359-1360 1365-1366 1369 1371
1374-1375 1378-1380 1383-1384 1389 1397
1400 1403 1409 1417 1423-1426 1437 1439
1442 1444 1446-1447 1450 1453 1468 1470
1473 1479 1481 1488 1490 1501-1504 1519
1521 1524 1528 1530-1534 1536-1537 1539
1541-1542 1547 1553 1555 1560 1565
1567-1571 1588 1591 1597-1598 1601-1602
1605 1614-1616 1619-1620 1623-1628
1630-1632 1634 1636 1641 1644-1645 1647
1649 1652-1655 1659 1662 1667 1673-1674
1680-1681 1684 1686-1688 1704-1705 1709
1711-1712 1717 1724 1726-1727 1731-1733
1737-1738 1741 1743-1744 1749 1754-1755
1760-1761 1765 1772 1785












AKD001 


4-8 10-11 17-21 29-31 35-39 42-45 50-51
56-58 60-61 64 68-69 75 77 80 82 85 87
92-94 97 100 102-104 107-108 112 116-117
119 123 127-133 136-137 139-141 143-144
147-154 157 161-163 165-166 169 172 176
178-179 192 194-197 199 201 203-206
209-210 212-213 215-216 223-228 234-236
238 247 251-253 257-259 261-262 265-269
271-272 274 276-277 279-281 284-286 290
293 296 298-299 301-302 304 307 311-313
321 325-326 329-331 333 341 344 348-350
352 356 358-359 362 364-365 368 370-372
374 376-377 380-382 392 395 398 400-401
404 407-409 414-415 423-424 430-437
443-444 446 449 451 453-455 459 461-462
464 467 469 471-474 476-477 480-481 483
487-488




















490-491 493 497-505 510-513 516-520 522
524 526-529 534 537-540 544 547 549
554-556 560 562 564 567 571-576 578 582
586-589 592-593 598-599 601 604-606
608-613 615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676 678-679
688 692-695 698 702 711 713 717 719-720
727 731 735-736 738 743 745-746 751 753
755 762-763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808 810-812
814-819 821 826 829 832 834-838 842-845
848-855 857-861 864-865 867 869 871 874
876-883 886-887 889-891 893-896 898-900
902 906-908 910-914 918 920 922 925-927
929-935 937 940-942 945 948-949 951
953-958 960-961 963-964 969-970 972
976-978 982-986 988-990 992-993 995-997
999-1002 1004-1008 1010 1012-1013
1016-1017 1019-1020 1022 1025-1031 1035
1038-1040 1042 1044 1047 1050 1054-1055
1057-1064 1068 1070-1073 1078 1085-1086
1088-1089 1092 1094 1097 1099-1102 1107
1109-1112 1116-1119 1121 1123-1125
1132-1135 1140 1142-1143 1146-1147
1149-1150 1153-1154 1157 1159 1163 1167
1170 1178-1179 1181 1183 1192 1196-1200
1202-1204 1206-1211 1216-1219 1221-1222
1225 1227-1230 1232-1234 1238-1241
1243-1244 1246-1247 1253 1257-1258
1260-1261 1267-1268 1270 1272-1274 1281
1283 1287-1289 1293-1295 1299 1306 1308
1311-1313 1317-1320 1323 1329-1330
1334-1335 1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-1379 1394
1397 1400 1403 1405 1407-1409 1417 1419
1423-1424 1428-1431 1433 1437-1438
1442-1443 1445-1446 1448-1450 1453-1454
1459 1461 1465-1468 1474-1475 1478
1484-1488 1490 1492-1493 1495 1497-1498
1506-1507 1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-1541
1547-1550 1552 1556-1559 1561 1565-1566
1568 1571 1575 1578-1579 1583 1586-1587
1589 1591-1592 1594 1598 1600 1603-1604
1606 1608 1611 1613 1615-1616 1618-1622
1624-1628 1631-1632 1634-1636 1638-1639
1641 1644 1646-1649 1653-1656 1662 1664
1666-1667 1670-1671 1676-1679 1683-1684
1686 1691-1692 1696-1699 1701 1709-1711
1713-1714 1716-1719 1723-1724 1726-1727
1733 1737-1738 1741 1743-1744 1748-1749
1751 1760-1761 1763-1768 1778 1780 1785




adult kidney 


Invitrogen 


AKT002 


20-21 37-39 47


52 SI 


60 65-66 68-69 80 104 107-108 122 130
133 136-137 140 142-143 149 169 174



181 197 227-228 235-236 244 251
261-265 267 280-281 286 290 299
301 304-305 309 312-313 339 341
344-345 349 358 370-372 376 382-
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536-
540 546 554 565 587 594 598 602
607 616-617 626-627 636 643 662-
664 695 709 721 735 743 761 768
775-777 788 796 804 814 827 837-
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919
925 927 934 941 949 952 957 960
962 968 970 1000 1008 1029-1030
1044 1052 1055 1063 1067-1068
1073 1085 1099-1102 1107 1110-
1111 1113 1115 1119 1126 1134
1136-1137 1146-1148 1153 1159
1192 1196 1199 1232-1233 1241
1256 1264 1272-1273 1281 1285
1293-1294 1299 1312 1320 1324-
1325 1330 1344 1349 1351 1355-
1356 1369 1378-1379 1403 1414
1419 1428-1429 1436 1446 1458
1463-1464 1467-1468 1470 1477-
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619
1623 1629 1631 1634 1638 1643
1647 1652 1660 1664 1667 1669-
1670 1673 1686 1709 1727 1740
1776



adult lung 



GIBCO 



ALG001



4-8 14 37-39 44-46 50-51 56 62-63 75 82
88 93 103-104 113 125 133 140 143 150
152 154 157 162 171-172 174-175 190 191
196 200 211 214 219 223-224 227-228
251-252 256 265 272 274 280-281 285 310
332 345 351 362 371 381-382 394 408-409
431 436 445 454 459 461 467 469 471
476-477 488 504 513 527 537-540 544
547-548 554 564 583 607 616-617 621
623-624 634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803 811 814
817 831-832 837-838 845 852-853 858-859
861 866 880 887 901 905 941 954-957 966
971 977 979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047 1054 1059
1062 1064 1072 1080 1086-1089 1094 1107
1126 1134 1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-1273 1280
1282 1295 1306 1320 1331-1332 1353 1374
1379 1383-1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522 1525
1531-1532 1547 1549 1553-1554 1571 1598
1606 1613 1624 1627-1629 1632 1642 1644
1662 1669 1676-1677 1684 1696 1727
1731-1732 1737-1738 1748-1749 1766



lymph node 



Clontech 



ALN001 



4 24 50-51 82 105 137 153 198
201 223-224 234 268-269 272 280-
281 287 301 312 329 343 382 421
430 433 445 451 461-462 475 481-
482 503 526 529 537-540 546-547



young liver 



621 626 649 679 719 725-726 738 793 803
831 834-836 838 844 857-858 866 879 905
913 928 963 976 1005-1006 1012 1038 1050
1116-1117 1151 1199 1204 1226 1243 1265
1274 1324-1325 1339 1353 1374 1377
1440-1441 1447 1504 1549 1600 1618-1619
1631 1641 1644 1653 1687-1688 1691-1692
1741 1771



GIBCO 



ALV001 



adult liver 



Invitrogen 



5-8 11 20-21 46 50-51 58 65-66 75 79 82
93 97 102-103 108 110 116 139 143-144
148-149 171-172 174 187-189 194-195 198
209 214-215 230 250 258 267-269 280-281
306 309 342 351 356 359 362 372 374 392
394 398 401 407-408 410 414 431 444 455
459 476 478 483 493 510-512 516 520 522
526 536 549 571 574-577 585 592 601-602
607 621-624 628-630 632-633 637 648 660
666-667 678 697-698 700 717 719 728 730
734 738 744-745 766 770 773 779 788 800
808 812 814 841 849-851 871 874 879 887
893 898-900 902-904 906-907 911 919 922
924 934 953 957 963 965 970 984 986 997
1001 1004 1007 1012 1029-1030 1033-1034
1052 1061 1066 1070 1076 1085 1089 1093
1099-1102 1110-1112 1116-1117 1119 1121
1125 1136-1137 1144-1145 1156-1157 1159
1196 1199-1200 1209 1211 1219-1220 1241
1244 1262 1270 1275 1279 1283 1295
1317-1320 1332 1339 1344 1359 1362-1363
1379 1383-1384 1403 1415 1430 1431 1437
1450 1467 1475-1476 1483-1484 1494-1495
1498 1505 1512 1516 1518-1519 1526 1529
1547 1550-1552 1557-1559 1565 1583 1587
1597 1609 1614 1620 1631 1637 1641 1644
1654-1655 1662 1667 1669 1684 1691-1692
1702 1711 1725 1738 1741 1743 1744 1758
1760-1761 1763-1765 1769



ALV002 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 926 946 948 972
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



adult liver 
adult ovary 



1550 1567 1578 1581 1583 1594 1597
1601-1602 1611-1612 1615 1618-1619 1621
1625 1637 1645 1647 1652 1654-1655 1660
1666 1669-1671 1684 1706 1722 1737-1738
1742-1744 1760-1761 1763-1765 1772 1774



Clontech 



ALV003 



29 676 997 1063 1119 1536 1766 



Invitrogen 



AOV001



1 4-18 20-23 29 35 40 42-48 50-51 53-58
61-63 65-66 68-69 73-75 77-78 80 82 85
87 89 97 100-101 103-104 106-108 110 113
115 118 122-124 126 128 133-134 136-140
142 145-147 149-157 161 166 168-170 174
177-178 180 182-186 188-189 192-203 207
209 211-215 219 221-224 229-230 234
242-243 246-247 255 258 260-262 265-269
271-272 274 277-281 284-286 288 290 295
299 301-302 304 307 309-311 313-314 316
321 323-326 330 332-333 335-338 341 344
349 352-353 356 358 360 362 370-372
376-377 379-384 387 390-392 394 397-398
400 403 408-410 412 414-416 423-424
426-427 430-435 439 443-446 448-449 451
453-455 462-463 468-471 473 476-479
481-484 487 489-494 496-497 499-501
503-505 509-514 516-517 519-520 522 524
526 528-534 541-544 546 547 549 552
554-555 561-564 566-567 569-570 572-573
575-576 579 581 583 585-588 590-591 593
595 597 599 601-605 607-613 615 618-622
624-627 630 632-633 636-640 642 644-647
649-652 654-655 657-665 667-675 677-678
681 683-684 692-695 697-710 714-721 723
725-727 729 732 734-735 743-746 750-751
753 758 763 765 767 772-773 775-778 780
783-784 786 788 790-791 794-796 800 803
805 809-811 813-815 818-819 821-824 826
828-829 831-832 837-838 843-850 852-857
859-864 867 869 871-872 874-875 878-883
887-888 890-895 898-910 912-914 916
919-922 924 926-927 929-939 941 943-946
948-951 953 955-958 961-964 966-967
970-979 981-982 985-986 988-990 992
995-997 999-1001 1004-1009 1011-1013
1016 1019-1020 1024-1025 1029-1031
1033-1035 1037 1039 1041-1047 1050-1051
1054-1060 1062-1064 1067-1070 1072-1073
1075-1076 1078-1079 1085-1086 1089-1090
1094-1096 1098-1103 1106-1108 1112-1117
1119-1120 1123-1127 1131-1135 1142-1143
1146-1149 1153 1156 1158 1163 1165-1166
1169-1171 1173-1175 1177-1178 1180
1183-1185 1190-1191 1195 1197-1200 1202
1205 1214 1217-1219 1221-1226 1232 1235
1238-1241 1243-1244 1247 1249 1252-1254
1256-1258 1262 1265 1267-1268 1270 1275
1278 1280-1283 1286-1289 1291 1293 1294
1298-



adult placenta 



Clontech 



1299 1306 1308 1312 1317-1321 1323 1327
1329-1330 1332-1333 1338-1339 1341
1343-1351 1356 1359 1361 1365-1366
1371-1375 1377-1379 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-1427
1429-1431 1435-1436 1439-1443 1445-1450
1453-1454 1459 1463-1464 1466 1468 1470
1474-1481 1484-1485 1488 1491 1493-1494
1496-1498 1501-1504 1506-1507 1511-1517
1519 1521-1524 1526-1527 1530-1531
1534-1536 1538-1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-1567
1569-1570 1572 1574-1575 1578 1580-1581
1587-1588 1590-1591 1595 1597-1598
1600-1606 1609 1611-1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-1657
1659-1662 1664 1667 1669-1671 1673-1674
1676-1681 1683-1690 1699 1702-1707
1710-1711 1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-1738
1740-1741 1743-1744 1748-1751 1753
1755-1756 1760-1762 1765 1767-1768
1770-1771 1776 1778-1779 1783-1784 1786





APL001 



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



placenta 



Invitrogen 



APL002 



adult spleen 



14-16 26 29 43 60-61 79-80 103 106 116
135 171 177 180 194 196 198 210 216
235-236 272 290 299 309 329 334 339 359
379-380 417 423 430 434-435 448 454 483
490-491 517 522 631 723 725-726 728 738
746 769 818 843 854-855 857-858 916 948
953-954 976 988-989 1005-1006 1013 1033
1036 1064 1068 1070 1086 1139 1144-1145
1160 1277 1285 1317 1320 1343 1345 1429
1435 1438 1454 1482 1486 1490 1512 1519
1532 1549 1592-1593 1602 1626 1647 1649
1664 1673 1675 1722 1727 1730 1746 1776



GIBCO 



ASP001 



3 5-8 12 15-16 19-21 24 29 34-36 44-45
57 60 82-83 87 89 94 98-99 103 106 108
117 119-121 139 141 147 152-153 155 166
169 171 174 178-180 196 198 201-206
209-211 215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312 325 333
341 349 358 372 382 386-387 394 406 414
431 434-436 446 448 451 473 481 490-493
500 503 505 517 519 530 534 536-540 547
554 557 574-576 582 592 595 604 611-612
620-621 623 631-632 642 652 659 661 667
671 673-675 684 700 721 728 730 732 738
742-744 746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848 852-853
858 862 866 874 879 882



884 906-908 912 919 921-923 926-927 934
942 949 957-958 963 977-978 983 990
992-994 996-997 999 1005-1007 1010 1012
1031 1036 1042-1044 1046 1049 1059 1068
1070 1076 1089-1090 1094 1103 1109 1113
1115 1124 1140 1163 1170 1174 1177 1190
1196 1219-1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295 1301 1320
1322 1330 1334-1335 1339 1349 1351 1353
1359-1360 1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468 1474 1477
1480 1485-1487 1498 1512 1522 1525
1544-1549 1553 1560 1567 1591 1600 1631
1636 1651 1654-1655 1658 1662 1670 1674
1678-1679 1684 1686 1700 1727 1733 1738
1740-1741 1760-1761 1774 1779 1781-1782



testis



GIBCO



ATS001 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837 
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992- 
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1553 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675 1684 1690 1699 1705 1712 
1717 1724 1730 1737-1738 1752 
1767 1779 



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



68* 1*52 1412 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 



Genomic DNA
from BAC 39316



Research
Genetics
(CITB BAC
Library)



BAC003 



1352 



adult bladder 



Invitrogen 



BLD001 



5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 16? 201 237 
251-252 272 278 311 348 363 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 788 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



bone marrow 



Clontech 



BMD001



3-8 11 13 18 29-31 33 35-36 40
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000 
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290- 
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 
1506 1509 1513 1521-1522 1524 
1526 1528 1531 1536-1537 1543 
1546 1548-1549 1552 1554-1555 
1557-1559 1571-1572 1581 1589-1592 
1597-1600 1609 1614 1621 1626-1628 
1630-1632 1634 1636 1638-1639 1641 
1646-1647 1651 1653-1655 1661-1662 
1676-1681 1684 1686 1690 1702 1707 
1711 1713-1714 1717 1720 1722-1723 
1727 1737-1738 1740 1758 1767 1772 
1781-1782 1785-1786 





bone marrow 



Clontech 



BMD002 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566 
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834- 
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1251 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367 
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573- 
1574 1578 1598-1600 1621 1626 
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



bone marrow 



Clontech 



BMD004 



73-74 503 922 1036 1711 



bone marrow 



Clontech 



BMD007 



95-95 866 1320 1475 



adult colon 



Invitrogen 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
1462-1464 1512 1556 1583 1587 
1594 1596 1614 1625-1626 1631 
1639 1645 1650 1675-1677 1687- 
1688 1701 1713-1714 1724 1740 
1765 



Mixture of 16 
tissues - 
mRNAs 



Various 
Vendors 



CTL016 



401 1490 1686 



Mixture of 16 
tissues - 
mRNAs* 



Various 
Vendors 



CTL021 



312 782 1132-1133 1403 1712 1715 



adult cervix 



BioChain 



CVX001 



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229- 
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581- 
582 585-586 588-589 593-594 600 
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053- 
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain 
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver 
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney 
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA 
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA 
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus 
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord 
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA 
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain). 
1406 1416 1425-1427 1431 1436-1437 
1442 1446 1448 1453 1459 1466 1472 
1478 1482 1496 1501-1503 1506 1512 
1522 1527-1528 1531 1533 1541 1547 
1569 1571 1585 1589 1597-1598 1600 
1608-1609 1614-1616 1620 1623-1624 
1626-1628 1630 1638 1641 1643 1649 
1653 1656 1662 1667 1669 1674-1675 
1683 1685-1688 1699 1702 1709-1710 
1715 1717 1722 1724 1729 1731-1732 
1735-1739 1741 1743-1744 1748-1749 
1755 1760-1762 1767 1773 1778 
1785-1786 












diaphragm 



BioChain 



DIA002 



137 282 289 730 780 986 1409 1478 
1599 1614 








endothelial 
cells 



Stratagene 



EDT001 



3 5-10 13 15-21 24-26 29 34 37-39 
42 44-45 50-51 53-55 57-58 60-61 
65-66 68-69 73-74 77-78 80 82-83 85 
87 89 93-96 101-105 108 110 112-114 
116 118-122 124 128 133-134 137-142 
147-150 152-153 161-163 166-172 
176-179 187 190 192 194 196-201 
204-207 210 212-214 220 224 229-230 
233 235-236 240-241 251-252 258 261- 


£t O £* D _J 


• 






267-269 272 276-277 279-281 284-285 
288 290 295-296 301-302 310-311 313 
316 321 325 329 331-333 335 340 342 
351-355 360 371 375 380-382 384 387 
390 392 397 400 407-408 410 412 414 
416 425-427 431 434-436 439 444-445 
449 454 463-464 472-475 477-479 486 
488-490 497-498 500-504 510-513 
516-519 522 524 526-528 532-534 
536-540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 595 
597 599 603 607-612 615-617 620 622 
626 630 632-634 638-641 644 647 
656-660 662-664 670 673 678 680-682 
692-697 707 709-710 712-713 719 730 
732 734 736 738 743-746 751 759 768 
771 773 775- 
778- 783 786-789 793 800 803 805-807 
810-811 814 816-818 821-822 824 826 
828-829 832 834-838 842-845 848-850 
854-860 862 864 869 871 874 876-879 
883 885 887 890-891 894-895 898-900 
903 908 910-913 916 919-922 924 
926-928 930-935 939 943 948-949 
951-954 957 959-961 964 969-970 973 
975-978 983-984 988-990 992-993 
996-997 1000 1002 1004-1013 
1016-1020 1022-1025 1028 1031 
1033-1034 1038-1046 1050 1055-1056 
1059-1060 1062-1064 1067-1070 
1072-1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 1107 
1109-1113 1116-1117 1124-1126 
1128-1131 1134-1135 1138 1140 
1144-1145 1148-1149 1153 1157 1160 
1163 1171 1183-1184 1198-1199 1202 
1205-1207 1211 1216-1217 1219 1221 
1225 1229 1232-1235 1238-1241 
1243-1244 1246 1250 1253 1257-1258 
1261 
1265-1266 1268 1270-1271 1274- 
1277 1280-1283 1285-1286 1288- 
1290 1293 1295 1298 1308 1312 
1317-1320 1324-1325 1327 1329- 
1330 1334-1335 1338 1342-1343 
1345-1347 1350 1355-1356 1359 
1367 1369 1374 1376 1379 1398 
1400 1406 1408 1414 1417 1419 
1424-1426 1428-1431 1434-1438 
1440-1442 1448 1450 1462-1466 
1468 1472 1474 1478 1487-1488 
1491-1493 1501-1504 1506 1509 
1511 1516 1520-1521 1526 1529 
1531 1536-1537 1539-1540 1546- 
1547 1549 1552 1555 1557-1559 
1561-1565 1568 1571 1575 1578- 
1579 1581-1583 1587-1588 1590 
1592 1597 1605-1606 1611 1613 
1615 1618-1621 1624-1628 1630- 
1631 1634 1636 1638 1641 1643- 
1650 1652-1659 1664 1666-1667 
1669 1671 1675-1681 1683-1688 
1696-1698 1703 1711 1715-1716 
1719 1722-1723 1726 1731-1733 
1736 1739-1741 1743-1744 1749 
1755 1760-1761 1765 1767-1768 
1771-1773 1776 1779 1783-1786 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



286 686 1297 1303-1304 1352 
1411 1412 1754 



esophagus 



BioChain 



ESO002 



131-132 261 289 380 503 860 892 
1000 1007 1397 



fetal brain 



Clontech 



FBR001 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain 



Clontech 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



5-9 25 43 60 62-63 65-66 70 72 
80 87 92 101 103 108 114 136 139 
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 638-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 699 701 706 
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843 
858 861 864 871-872 864 890-891 
894-895 898 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992 



fetal brain 



Clontech 



FBR006 
fetal brain 
fetal brain 



Clontech 
Invitrogen 



FBRS03 



999 1001 1005 1006 1008 1013 
1016 1022 1024 1029 1030 1032 
1035 1042 1047-1048 1052 1056 
1065 1067 1070 1082 1089 1109 
1114-1115 1119 1131 1143 1149 
1151 1153-1156 1160 1163 1167 
1172-1173 1178 1184 1186 1188 
1190-1200 1211 1216 1222 1223 
1226-1227 1229 1231 1236 1245 
1253-1255 1258 1260 1262 1266 
1270-1273 1281 1287 1308-1309 
1314 1317 1320 1326 1334-1335 
1339 1341 1344 1350 1356 1369- 
1371 1373 1376 1379 1381-1382 
1386 1392 1396-1398 1419 1423 
1425-1426 1428 1429 1432 1437 
1440-1441 1448 1466 1470 1482 
1502-1503 1507 1511 1513 1516 
1519 1536 1544 1549-1550 1557- 
1559 1573 1589-1590 1598 1608 
1611-1614 1619 1621 1625 1626 
1640 1651 1657 1658 1676-1679 
1693 1696 1703 1704 1713 1714 
1718 1720 1722 1724 1726 1728 
1730-1733 1735-1736 1738 1739 
1742 1745 1755 1759 1761 1765 
1767 1771-1772 1777 1779 1780 
1786 



235-236 520 864 1068 1188 1587 



FBT002 



fetal heart 
fetal kidney 



Invitrogen 



FHR001 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120 1128 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



105 124 180 289 864 1036 1148 
1229 1614 1616 1762 1785 



Clontech 



FKD001 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
258 277 280-281 307 310 314 330 371 
387 392 395 403 422-423 431 436 443 
455 469 500 519 522 542 563 572-573 
585 600 619 623 650 654 657-658 660 
679 719 731 780 798 821 833 844 
854-855 857 864 868 878 911 929 958 
960 969 990 992 1007 1046 1087 1103 
1129 1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 1618 
1631 1651 1654-1655 1669 1678-1679 
1691-1692 1733 1785 


fetal kidney 


Clontech 


FKD002 


352 384 426-427 440 583 602 1060 1131 
1324-1325 1636 






fetal kidney 


Invitrogen 


FKD007 



20-21 82 163 335 679 988-989 1000 
1227 1230 1320 1554 


fetal lung 


Clontech 


FLG001 


35-36 94 323 371 398 426-427 445 473 
549 560 604 616-617 626 631 649 651 
719 746 786-787 832 842 849-850 864 
894-895 1075 1178 1182 1200 1206 
1309 1311 1345 1429 1493 1567 1576 
1620 1686 


fetal lung 


Invitrogen 


FLG003 


9 15-16 29 41 47 68-69 83 88-89 102 
124 137 152-153 165 196 224 229 231 
249 254 256 267 291-292 300 325 333 
344-345 352 373 376 379 384 408 
426-427 430 432 467-468 475 483 488 
493 516 531 535 545 547 549 564 582 
602 623 644 660 662-664 670 673 
725-726 728 761 766-767 774 805 830 
852-853 864 875 921 932 937 946 949 
963 988-989 1014 1016-1017 1024 
1027 1090 1097 1170 1185 1200 
1215-1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413-1414 
1431 1438 1449 1491 1512 1536 1547 
1557-1560 1567 1590 1601 1636 1644 
1653-1655 1662 1667 1671 1675 
1680-1681 1706 1739 1760-1761 1769 






fetal lung 


Clontech 


FLG004 


103 276 334 465-466 737 843 1131 1614 
1658 








fetal liver- 
spleen 



Columbia 
University 



FLS001 



3-11 13 15-21 25 30-39 41-48 50-51 
54 56-58 60-66 68-69 72 75 77-80 
82-83 85 87 89 92-103 105-110 112 
116-124 126-127 130 133 135-139 141 
144 147-149 152-153 157 163-165 
167-172 174 176-178 180 186 188-190 
193-194 196 198-200 202-206 210-214 
219 221-231 233-236 240-244 246-247 
250-251 255-256 258 261-265 268-269 
272 274 276-278 280-281 284-286 288 
293 295 299-301 304 306-307 309 311 
314 316 318 320-321 326 329-332 342 
344-345 350 352-353 356-358 360 362 
370-374 376 378-384 386-387 390 
392-393 400-401 403 406 408 410-412 
415 417 419 422-437 439-442 444-445 
448 452-454 456 459 461-470 472-479 
481-483 487-488 490-491 493 500-501 
503-506 509-513 515-520 522-524 
526-529 531 534 536-540 542 547-549 
553-554 561-562 564 567-568 571-576 
579 581 583 585-597 599-605 






WO 01/53312 PCT/US00/34263 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774- 
777 779 783-788 793 796 798 800- 
805 808 810-812 814 810-819 821- 
824 826-832 834-837 843-847 849- 
867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



fetal liver- 
spleen 



Columbia 
University 



FLS002 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 
206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 SOS- 
SOS 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 
1597-1598 1600 1611-1612 1618 1628 
1630-1631 1635-1638 1641 1646-1649 
1652 1654-1659 1661-1662 1664 
1667-1669 1674 1676-1679 1683-1684 
1686-1688 1691-1692 1699 1702 1707 
1711 1713-1714 1717 1719 1722 
1726-1727 1730-1733 1738 1740 
1743-1744 1748-1752 1758 1760-1761 
1763-1764 1767 1769 1772-1773 1776 
1779 1783-1786 



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



fetal liver 



Invitrogen 



FLV001 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Clontech 



FLV002 



676 998 1719 



fetal liver 



Clontech 



FLV004 



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal muscle 



Invitrogen 



FMS001 



26 37-39 50-51 58 84 86 89 98 
113 128 131-132 139 155 172 186 
194 198 201 206 211 230-231 256 
261 276 282 286 302 325 359 361 
376 379 383 398 412-413 419 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 788 
826 837 860 874 913 915 921 935 
970 980 986 988-990 992 1000- 
1001 1007 1014 1027 1035-1036 
1045 1060 1064 1070 1083 1097 
1099-1102 1116-1117 1121 1164 1173 
1198 1208 1228 1240 1258 1266 1270 
1277 1298 1317-1320 1324-1325 1329 
1336-1337 1369 1383-1384 1399-1400 
1403 1409 1433 1505 1514 1542 1551 
1554 1557-1559 1562 1589 1599 1620 
1632 1644 1650 1652 1671 1675 1712 
1725-1726 1743-1744 1754 1766 



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148 1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588 
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 
1626 1632 1634 1636 1641 1643- 
1644 1646 1654-1657 1660-1662 
1665 1668 1675 1685 1687-1689 
1702-1703 1709-1710 1716 1719 
1724 1727 1731-1732 1737-1740 
1742 1747 1749 1755 1760-1761 
1765 1772 1776-1777 1779-1780 
1786 



fetal skin 



Invitrogen 



FSK002 



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 426-427 433 436 450 454 
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326 
1333-1335 1343 1347 1350 1369- 
1371 1377-1378 1391 1397 1422 
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755 



fetal spleen 



BioChain 



FSP001 



umbilical cord 



BioChain 



FUC001 



110 137 211 353 589 927 1108 
1639 1771 



fetal brain 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



GIBCO 



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
Tissue Origin

RNA Source

Hyseq Library Name

SEQ ID NOS:










72 75 77 80 


82 


85 90-91 


94 100- 








102 


107 


110 


112 


-116 


118-119 122-








123 


126 


128 


134 


136 


-140 


147-148 








153- 


155 


157 


161 


165 


169-172 175








181 


186 


188 


-189 


197 


-198 


204-206 








208 


210 


215 


222 


-223 


225- 


226 230 








235- 


238 


240 


-241 


247 


253 


256-258 








260- 


262 


267 


-269 


276 


279- 


281 284 








286 


289 


298 


300 


-302 


307 


310 318 








321- 


323 


325 


330 


-331 


339 


341 346- 








349 


352 


354 


356 


-359 


362 


364-365 








371- 


372 


377 


379 


-380 


382 


384 387 








390 


400 


408 


414 


-416 


419 


424 431 








434- 


435 


438 


441 


-443 


449 


451 453- 








455 


457- 


-463 


470 


472 


-473 


475 477- 








478 


482- 


-483 


486 


-488 


490- 


491 493 








496 


499-500


502 


-504 


506- 


507 509- 








512 


515 


519- 


-520 


522 


525- 


526 529- 








530 


537- 


-540 


543 


-544 


546- 


547 566-








567 


569-570


572 


-582 


585 


588 590- 








591 


593 


595 


599 


601 


604 


606-609 








611- 


612 


614-620


622 


-624 


630 632 








636 


643 


645-647


650 


-652 


654 659 








661 


665 


667-668


670 


-672 


676 678 








681 


687 


689 


692- 


-694 


697 


699 710 








714 


717 


721 


727 


729 


-732 


734 736 








738 


743-746 


750-751 


759 


763 766 








770 


772 


775-777 


784 


789


791 796 








799 


802-805 


810-811 


814 


819-821 








824 


826 


830 


834- 


-837 


839- 


850 854- 








856 


858- 


860 


862 


864 


869 


871 876- 








877 


879 


883 


886-887 


890- 


891 893- 








895 


898- 


901 


905 


908 


-910 


912-916 








919 


922- 


923 


925 


927 


930- 


933 935- 








938 


948 


952- 


960 


963-964 


967 969- 








972 


975 


978- 


979 


981 


983 


986-987 








990 


992 


995 


997 


999-1002 


1005- 








1009 


1011-1013 1016 


1018 


-1019 








1023 


1026 1029-1031 


1033 


-1035 








1038 


1041 1047 1050 


1053 


1057 








1059 


1064 1068 1070 


1072 


-1073








1078 


-1079 1081-1082 


1086 


1089 








1094 


1097 1103 1107- 


-1109 


1113- 








1115 


1121-1122 1127 


1134 


-1135 








1138 


1140 1143 1148-1151


1153 








1156 


-1157 1159 1167 


1170 


1175 








1193-1194 1200 1202


1207 


-1209 








1211 


1216 1219-1220 


1226 


-1227 








1229 


1232-1234 1240- 


1241 


1243 








1246 


1249-1251 1253-1254 1258








1267 


-1268 1271 1276 


1279 


1282








1285-1289 1293-1294


1305 


1307- 








1308 


1312 1316 1320 


1327 


1338- 








1339 


1341-1344 1346 


1349 


1355- 








1357 


1359 1365-1366 


1369 


-1370 








1373-1375 1379 1386


1389 


1394 








1398 


1409 1413-1414 


1416 


-1417 








1420 


-1421 1425-1427 


1430 


1433 








1437 


1439 1442 1445-1452


1454- 








1457 


1459 1463-1464 


1468 


1470 








1474 


1477-1479 1489 


1492 


1494 








1497 


-1498 1501-1503 


1507 


1509 








1511 


-1513 1517 1520- 


1521 


1524- 








1526 


1531-1533 1535 


1537 


-1538 








1547 


1554 1556-1559 


1564 


-1567 








1571 


1584 1587 1589 


1594 


1599- 








1601 


1611-1612 1614- 


1616 


1619- 








1620 


1625-1628 1630- 


1631 


1634 








1637 


-1638 1640-1643 


1645 


1648- 
























1649 1651 1653- 


1655 


1657-1658 








1664-1665 1667 


1669 


1673 1678-1679 1683-1684


1686 


1693 1701 








1704-1705 1709 


1713 


-1714 1717- 








1720 1724 1727-1728


1731-1733 








1737-1738 1743- 


1744 


1752 1754- 








1755 1757 1760-1761


1765 1772 








1779 17 


35 












macrophage

Invitrogen

HMP001

5-8 110 204-205 503 634 678 859 878 933 988-989 1379 1448 1504


infant brain

Columbia University

IB2002

10 12-13 15-18 22-23 25 29 34




37-39 43 47 


50- 


51 54-56 


58 


60-63 








65-66 68-69


72- 


74 80 82- 


-83 


86 








88-92 97 100 102-104 106-108 110








112- 


113 


115 


-116 


118 


123 


128 


130








134-136 


138 


-139 


143 


147-149 


151- 








152 


154-155 


163 


165 


-167 


169 


172- 








175 


181-184 


186 


193 


-196 


198 


201 








203- 


205 


209-210 


214 


-215 


222 


224- 








226 


231-232 


235 


-236 


239 


246 


-247 








252 


257 


260 


268 


-269 


272 


276 


-277 








279- 


281 


286 


288 


291 


-292 


295 


298 








300- 


301 


304 


307 


310 


313 


321 


-323 








330- 


331 


333-334 


339 


346- 


347 


349 








352 


356-357 


362 


371 


-372 


377 


379- 








380 


383-384 


392 


397 


401 


406 


408 








411 


413-414 


416 


418 


-419 


422 


428 








430- 


431 


434 


-435 


438 


443 


449 


453- 








454 


461 


464-466 


469 


-470 


472 


-473 








475- 


476 


478 


482 


-483 


487 


490 


492 








494 


497 


503 


507 


-508 


510- 


513 


516 








519- 


520 


524-526 


530 


-534 


536 


-540 








547 


550- 


-551 


561 


563-564 


566-567 








572- 


576 


579 


581 


-582 


584- 


587 


590- 








591 


593 


595-597


607 


-609 


611 


-613 








616- 


617 


620 


622 


-624 


627 


631 


637 








641 


645-647


650 


-655 


657- 


658 


660- 








665 


667- 


675 


689 


691 


695 


697 


699 








703 


707 


713- 


-715 


717 


721 


728- 


-731 








733- 


736 


739 


743 


745 


751 


755 


759 








763 


769-770


772 


778 


780- 


781 


785 








788-


789 


793- 


794 


799 


803 


808 


811 








814 


825- 


826 


830 


834- 


-836 


840- 


-843 








845 


848- 


850 


854 


-855 


860 


862 


864- 








865 


870 


872 


875 


-876 


878 


886 


888 








890- 


891 


894- 


896 


898 


903- 


904 


916- 








917 


919 


922- 


925 


927-928 


930- 


932 








934- 


936 


938 


941 


945-946 948-950








953- 


954 


959- 


962 


966- 


969 


977 


979 








981 


986- 


990 


992 


997 


999- 


1000 








1004 


-1006 1014 1016 


1018 


-1019 








1024 


-1025 1033 1036 


1047 


1051- 








1052 


1054-1055 1057- 


1059 


1063- 








1064 


1068-1070 1073 


1081 


-1082 








1085 


1089 1108-1113 


1118 


-1120 








1123 


-1124 1130 1132- 


1138 


1140 








1149 


1151 1153-1154 


1163 


-1170 








1172 


1174-1175 1183- 


1184 


1188 








1190 


1193-1194 1196- 


1197 


1199 








1204 


1208-1209 1211 


1218 


-1222 








1226 


-1227 1229 1231 


1234 


1241 








1247 


1249 1251 1256 


1258 


1261- 








1262 


1269 1274 1279 


1281 


1283 








1285 


1287-1289 1294-


1295 


1305 








1307 


1313-1314 1316- 


1320 


1329 








1332 


1341-1342 1345 


1349 


1356 








1362-1363 1365-1366 1368-1370








1374 


1381 1383-1384


1388 


1400 








1403 


1406-1407 1413 


1417 


1420 
infant brain



1423 1429-1431 1435-1436 1439-1441 1443 1447-1449 1451-1452
1454-1455 1457 1459 1463-1465 1468 1470-1471 1475 1479
1482-1483 1485 1493 1494 1496 1498-1499 1502-1503 1505-1507
1509 1522-1523 1525 1528 1531 1533 1542 1546-1547 1549-1550
1554-1555 1563 1565-1567 1569 1575 1580 1583-1586 1588 1590
1592-1593 1595 1598 1600-1601 1608-1610 1612 1614-1616 1619
1621 1624 1626-1627 1630-1633 1637 1639-1640 1642 1644 1647
1652 1654-1655 1658-1659 1664-1665 1672-1673 1676-1681
1685-1688 1693-1695 1701-1702 1704 1708 1717-1720 1723-1724
1726-1728 1733 1735-1741 1743-1744 1752 1755-1758 1762 1765
1771 1774 1777-1778 1786



Columbia 
University 



IB2003 



infant brain 



infant brain 



Columbia 
University 



17-18 20-23 29 34 43 60 68-69 
78-80 88 100-101 107 110 -112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
279-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
675 678 689 715 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1283-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1536 1546 1557- 
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



Columbia 
University 



IBM002 



IBS001 



101 113 139 152 260 279 290-292
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



10 12 119 175 279-281 321 334 
371 446 551 563 623 652 667 669 


















671-672 819 949 966 


1113 1130 








1151 1188 1193- 


1194 


1196 1229 








1258 1265 1271 


1287 


1317-1319 








1324-1325 1342 


1423


1440-1441 








1448 1471 1482 


1525 


1532 1546 








1562 1569 1588 


1591 


1610 1618 









1647 1649 1658 










lung,


Stratagene


LFB001


5-9 17 20-21 2S 


68- 


69 82 94 


105 


fibroblast






153 157 197-198 


203 


207 


-208 


212- 








213 223 262 266 


283 


302 


321 


326 








333 356 370 427 


430 


436 


446 


462 








472 493 498 503 


516 


519 


527 


535 








537-540 542-544 


562-565


567 


586 








599-600 607 615 


630 


647 


662 


-664 








692-694 712 719 


745 


748 


775 


-777 








794-796 810 837 


843 


-847 


849 


854- 








856 869 876 903 


934 


953 


955 


-956 








964 975-976 984 


1000 1005-1007 








1024-1025 1033 


1039 


1053 1064 








1070 1072 1082 


1112 


-1113 1134 








1136-1138 1140 


1195 


1223 1232- 








1233 1246 1279 


1285 


1295 1311








1320 1334-1335 


1343 


1427-1428








1446 1478 1482 


1493 


1504 1537 








1552 1555 1567 


1575 


1582 1598 








1620 1625 1632 


1638 


1645 1654-








1655 1662 1680- 


1681 


1684 1686 








1690 1696 1702 


1711 


1733 1741 








1760-1761 1778 


1785 








lung tumor 


Invitrogen 


LGT002 


5-10 18 20-21 29 33 


-36 40 43 52 








54-55 61 65-66 


68-70 73 


-75 


80 85








88-89 93-94 100 


103 


106 


-108 


112- 








113 115-116 118 


-119 


123 


-124 


126 








130-132 135-137 


139 


-141 


143 


-144 








147-148 151-153 


155- 


-156 


159 


161 








164 169 171 179 


-180 


185 


190 


192 








194 196-199 203 


-208 


210 


212 


-214 








216-217 219 222 


233 


240- 


-241 


244 








246 251-252 255 


-256 


261- 


-262 


266








272 276-277 279 


-281 


284 


286 


288 








290 295 298 301 


-302 


309-312 


317 








321 329 332 341 


-342 


344- 


-345 


348








352 358-360 363 


368 


370-371 


376 








380-381 384 389-390 


398 


400 


409 








414 423 426-427 


430 


432-436


443- 








444 450-451 454 


462 


468 


472- 


477 








480-483 487-488 


490- 


491 


493 


496- 








498 500 503-506 


509- 


512 


515-516 








519 521-523 526 


530 


534 


541 


544 








547 554 557 564 


566- 


567 


572- 


576 








585-586 588-589 


595- 


596 


601 


607 








611-612 615 619 


621 


623 


626 


630 








632-633 644 647 


649 


651 


655- 


656 








660 662-665 667 


669 


672 


683- 


684 








696 700 706 710 


713 


716 


718- 


719 








722-723 728 734-739 


743 


750 


752 








763 765-766 773-778 


784- 


785 


787- 








789 791 800 802- 


803 


809- 


812 


814 








824 826 828-829


832 


838- 


839 


841- 








845 849-850 852- 


855 


857- 


861 


864 








866 874 878-880 


882 


887 


890- 


891 








897-898 902 904 


906- 


907 


910 


916 








918-920 922 924- 


925 


927 


930- 


932 








934-935 937 947 


950 


953 


955- 


956 








961 963 966-967 


969 


971 


977- 


979 








981 984 986-987 


990 


992- 


993 


995 








997 999-1001 1005-1007 1009










1012-1013 1018 1020 


1022 


-1024 






1026 1029-1030 1033


1038 


1041 




















1045 


1047 


-1050 


1052 


1054 


-1055 








1059 


1063 


-1064 


1067 


-1071 


1073- 








1074 


1078 


1085 


1087 


1089 


1095- 








1097 


1104 


1106-1107 


1109 


1112 








1116 


-1117 


1119 


1126 


1134 


-1135 








1139 


1141 


-1142 


1144 


-1145 


1148 








1152 


-1153 


1156- 


-1158 


1167 


1170 








1172 


1178 


1195-1196 


1198 


-1200 








1202 


1204


1208 


1214 


1216 


1219 








1222 


1227 


1234 


1241 


1247 


1252 








1257 


-1258 


1265 


1267 


-1270 


1276 








1278 


1280 


-1281 


1283 


1285 


1288- 








1289 


1295 


1300 


1305 


1308 


1312 








1317 


-1321 


1329 


1338 


-1339 


1341 








1344 


-1346 


1349- 


-1351 


1353 


-1355 








1357 


1365 


-1366 


1369 


1378 


-1379 








1383 


-1385 


1394 


1397 


1400 


1402- 








1403 


1408 


1417 


1419 


1423 


-1426 








1431 


1433 


-1436 


1438 


1444 


1446- 








1448 


1454 


-1455 


1460 


1466 


1468 








1470 


1474 


1480- 


1481 


1483 


1486- 








1488 


1490 


-1491 


1494-1496 


1506 








1508 


-1509 


1511- 


1512 


1515 


-1516 








1519 


1523- 


-1524 


1528 


-1529 


1536- 








1540 


1546 


1549- 


1550 


1555 


1560- 








1561


1565 


1567 


1569 


1575 


1588 








1591 


1593- 


-1594 


1596-1598 


1600- 








1602 


1608 


1614- 


1616 


1618 


1620 








1624-1625 


1627- 


1632 


1636 


1639 








1644-1645 


1647- 


1649 


1652-1653 








1656-1662 


1664 


1666-1667 


1670- 








1671 


1673-1675 


1678-1679 


1683 








1685-1688 


1690- 


1692 


1696-1699 








1705 


1709 


1716- 


1717 


1722 


1727 








1730 


1735 


1739 


1741 


1743-1744 








1748-1749 


1753 


1760-1762 


1765 








1767 


1770-1771 


1773 


1775-1776 








1778-1779 


1786 








lymphocytes 


ATCC 


LPC001


4 11-12 18 24-25 30-31 48 50-51 








56-57 68-69 80 


92 98 103 


105 110 








126 137 152-153 


157 


165 172 188- 








189 197 203 210 


217-218 222-223








225-226 229 231 


247 


251 256 264 








272 280-281 284 


300-301 321 325- 








326 339 348 352


357 


371 382 384 








390 400 404 412 


414 


421 423 426- 








427 430-431 445 


447- 


448 451 454- 








455 475 503 516 


526-527 530 537-








540 549 556-560 


563 


574 577 589 








602 613 615-617 


621 


623 628-630 








636-637 647 649 


657- 


659 690 697 








717 723 755 764 


775- 


777 780 786 








789-790 793 800 


802 


822 838 849 








866 869 876 881 


-883 


892 898 906- 








907 911 921-923


928 


975 990 992 








996 1001 1004-1007 1033 1050 








1054 


1078 


1107 


1135 


1140- 


1141 








1143 


1148 


1158 


1163 


1177 


1199 








1205 


1216 


1226 


1231 


1236 


1241 








1244 


1250 


1258 


1260 


1265 


1269-








1271 


1290- 


1293 


1308 


1312 


1317 








1319- 


1320 


1339 


1345- 


1346 


1348 








1350- 


1351 


1357 


1367 


1369 


1379 








1381 


1383- 


1384 


1386- 


1387 


1389 








1394 


1397 


1405 


1423 


1425- 


1428 








1431 


1437 


1446 


1448 


1461 


1466 








1470 


1472 


1474 


1482 


1492 


1506 








1528 


1537


1546 


1549 


1591 


1598 








1600 


1603- 


1604 


1606 


1627 


1636 



leukocyte 



1638 1647-1649 1651 1658-1659 
1664 1676-1677 1680-1681 1687-
1688 1699 1711 1715-1716 1726 
1728 1737 1740 1746 1748 1752 
1756 1758 1777 1779 



GIBCO 



LUC001 



3-4 10-11 13 15-18 20-21 24-25 
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
212-215 217-219 222-223 229 235- 
236 247 251 255-258 260 262 272 
274-277 280-281 285-286 297-301 
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-451 453-454 456 459 461- 
464 468-472 474-479 481 483-485 
487-491 496 499-501 503-504 509- 
513 516-519 522 526-527 529-531 
534 536-540 542 547-549 553-559 
566-567 571 574-577 579 582 584- 
586 589 593 595-597 601-602 604 
606-607 611-613 615-621 623 627- 
629 633 636-637 642 644-650 655 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
738-739 743-746 749 751 753 756 
759 765-766 768 770-778 780 784- 
786 788-790 793 796 798 800 802- 
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 845-860 
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450 




















1453 


1458 


-1459 


1463- 


-1464 


1468 








1470 


-1471 


1474 


1477- 


-1478 


1482- 








1488 


1490 


-1493 


1496- 


-1501 


1504 








1506 


1509 


1512- 


1513 


1516 


1519 








1521 


-1522 


1524- 


1525 


1527 


-1528 








1531 


1534 


1538 


1541 


1545 


-1547 








1549 


-1550 


1553 


1555-1556 


1560 








1565 


1567 


1575 


1580 


1589 


1591 








1594 


1596 


1598 


1600- 


-1602 


1606- 








1608 


1611 


1614 


1620- 


-1621 


1624 








1626 


-1629 


1631- 


1632 


1636 


1638- 








1639 


1641 


1644- 


1645 


1648 


-1650 








1653 


-1655 


1658- 


1660 


1662 


1669- 








1670 


1675 


-1679 


1684-1688 


1690- 








1692 


1696 


1700 


1702 


1707 


-1709 








1711 


1716 


-1717 


1720 


1723 


1725- 








1727 


1733 


1737- 


1738 


1741 


1743- 








1744 


1748 


-1749 


1752 


1755 


1760- 








1762 


1765 


1769 


1771- 


-1772 


1781-








1784 


1786 










leukocyte 


Clontech 


LUC003 


4 35-36 44-45 61 68-69 75 82 102








119 


139 154 179 


197 


244 


280-281 








324 


372 404 430 


-431 


455 


461 476- 








477 


481 503 537 


-540 


554 


575-576 








581 


589 608-609 


621-622


624 630 








632 


647 662-664 


669 


679 


698 764 








773 


775-777 802 


848 


851 


856-857








879 


905-907 915 


949 


952 


990 992 








1002 


1113 


1119 


1170 


1183 


1216 








1236 


-1237 


1241 


1275 


1346 


1353 








1357 


1359 


1377 


1506 


1515 


1534 








1553 


1591 


1600 


1613-1614 


1621 








1628 


1670 


1676- 


1677 


1691 


-1692 








1699 


1733 


1738 


1772 






melanoma from 


Clontech 


MEL004 


25 35-36 43 80 


104 126 128 150 


cell line ATCC 






163 


166 188-189 


197 


210 


215 220 


#CRL 1424 






271 


277 280-281 


310 


317 


336-338 








345 


351 372 380 


-381 


383 


387 412 








415- 


416 430 445 


448 


454 


456 467 








481 


490 499 503 


526 


528 


546 548 








567 


575-576 588 


601 


613 


615 647 








660 


665 734-735 


737 


759 


778 787 








790 


800 832 845 


856 


859 


869 878 








883 


887 905 914 


932 


934 


958 976 








985 


990 992 999 


-1000 1025 1031 








1038 


1050 


1055 


1068 


1074 


1088 








1099 


-1102 


1107 


1136- 


-1138 


1149 








1156 


1163 


1172 


1190 


1195 


1200 








1214 


-1215 


1217 


1226-1227 


1235 








1238 


-1239 


1244 


1253 


1278 


1280 








1293 


1311 


1320 


1330 


1334 


-1335 








1345 


1355 


1367 


1386- 


1387 


1394 








1403 


1406 


1414 


1423 


1437 


1442 








1465 


1521 


1529 


1536 


1539 


1541 








1547 


-1548 


1582 


1620 


1626 


1631 








1638 


1647


1653 


1660 


1667 


1669- 








1670 


1680-1681


1696 


1704 


1715 








1724 


-1725 


1731- 


1732 


1750 


1760- 








1761












mammary gland 


Invitrogen 


MMG001 


5-8 


10 12 


14-18 


20-21 24 


-25 29 








33-39 42-43 52


55-58 60- 


64 68-69 








71 73-74 79-80


82 89 98 


100 103 








106 


108 112 123 


128 


133- 


137 144- 








146 


148 150-152 


154 


158- 


159 165- 








166 


170-172 174 


176 


178 


181-185 








188- 


190 194-198 


201- 


206 


210 217- 








222 


224 227-228 


231 


233- 


237 247 








251 


253-254 256 


261- 


263 


266-267 








271 


276-277 279 


-281 


284- 


286 288 




















290 


297 299 301 304


309- 


312 318 








320- 


321 323-325 327


-329 


331-332 








334 


339 341 344-345 


348 


350 356 








359- 


360 362-363 368


371 


376 379- 








383 


388 390 393-395


397- 


398 405 








406 


412 414-415 423


430 


434-437 








441- 


444 448 451-455 


462- 


464 474 








476 


479 482 485-486


486 


490 494- 








495 


498 503 506 509 


-512 


516-517 








519- 


520 522 527 529 


534 


537-541 








547 


549 554 557 562 


572- 


574 587 








589- 


591 597 602 607 


618 


623 628- 








629 


632 634-640 644 


647- 


648 650- 








652 


655 657-658 660 


665 


667 669- 








672 


674-676 679 682 


688 


695-696 








706- 


707 710 713 717 


720 


722-730 








732- 


734 736 738 743 


747- 


748 750 








755 


759 761 766 770 


780 


784 786- 








789 


794 803 806-807


809 


814 817- 








822 


827-829 837 842 


854- 


858 863- 








864 


866 869-870 872 


878 


881 889 








893-


900 904 906-907 


911 


916 919 








921- 


923 926 935-937 


946 


948-949 








953- 


954 957 960-961 


963 


965-966 








970 


977-978 984-989 


993- 


997 








1000 


-1001 


1005 


-1006 


1008 


1013- 








1014 


1016 


-1017 


1023 


1025 


1027 








1032 


-1033 


1036 


1039 


1043 


1045 








1055 


1057 


-1058 


1063 


1068 


-1075 








1077 


-1078 


1085 


1087 


1089 


-1091 








1095 


-1102 


1107-1108 


1112 


-1119 








1121 


-1123 


1131- 


-1133 


1136 


-1137 








1139 


-1142 


1144-1145 


1148 


-1149 








1153 


1159 


1167 


1170 


1172 


-1173 








1183 


-1185 


1190-1192 


1196 


-1199 








1207 


-1208 


1212 


1216-1218 


1222- 








1223 


1225 


1231 


1234 


1240 


-1241 








1247 


1253- 


-1254 


1258- 


-1259 


1261- 








1262 


1270- 


-1280 


1283 


1285


-1286 








1298 


1307 


1314 


1316-1320


1323- 








1325 


1330 


1334-1335 


1342 


-1345 








1349 


-1352 


1354-1355


1359 


1369- 








1370 


1377 


1379 


1381 


1383 


-1384 








1389 


1405 


1414 


1419 


1421 


-1423 








1425 


-1426 


1428- 


1429 


1431 


1434- 








1437 


1439 


1448- 


1449 


1454 


1457 








1460 


-1464 


1466 


1471 


1480 


-1483 








1487 


1489-1491 


1493 


1505 


1507 








1512 


1519 


1526- 


1528 


1532 


1534 








1536 


1539 


1542 


1547 


1549 


-1550 








1554 


1561-1562 


1564 


1567 


1572 








1576 


-1579 


1581- 


1582 


1587 


-1588 








1592 


1594 


1596- 


1597 


1601 


-1602 








1607 


-1608 


1610 


1612- 


1616 


1618 








1621 


-1622 


1625- 


1626 


1631 


1635- 








1636 


1641 


1643- 


1644 


1647 


1650 








1652 


1654- 


1655 


1657- 


1658 


1660 








1662 


1664- 


1666 


1669- 


1671 


1673-








1674 


1676- 


1677 


1680- 


1685 


1689- 








1692 


1701 


1706 


1713- 


1715 


1719- 








1720 


1723- 


1728 


1730- 


1732 


1738 








1740 


1742- 


1744 


1746- 


1747 


1749 








1751 


1753 


1760- 


1762 


1765- 


-1768 








1771 


1774 


1776- 


1777 


1779 


1783- 








1784 


1786 










induced neuron 


Stratagene


NTD001 


29 35-36 80 116


123 


156 163 181 


cells 






214 230 280-281


284- 


285 307 321 








330 340 358 371 


375 


377 380 382 








422 424 492 497 


532- 


533 542 546 



retinoic acid

induced 
neuronal cells 



neuronal cells 



pituitary 
gland 



Stratagene



Stratagene



Clontech 






NTR001 



NTU001 



PIT004 






549 566 586 595 612 
734 775-778 780 792 
856 858 875 936 953 
1041-1043 1055 1072 
1194 1206 1223 1246 
1288-1289 1291 1294 
1349 1359 1412 1423 
1623 1645 1684 1705 



645-647 654 
799 821 826 
985 990 992 
1104 1193- 
1253 1274 
1311 1320 
1485 1620 
1715 1751 



5-8 78 268-269 277 383 431 506 
623 677 731 999-1000 1199 1425- 
1426 1547 



29 65-66 80 82 110 119 146 152
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784



311 314 379 408 419 430 454 1055 
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



placenta 



Clontech 
Clontech 



PLA003 



5-8 124 208 277 370 843 906-907 
1280 1317-1319 1369 1609 1621 
1737 



prostate 



PRT001 



rectum 



Invitrogen 



9 46 57 71 107 147 171 177 197
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



REC001



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628-
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948- 
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425- 










1426 1436 1439 1469 1474 1477 








1482 1546 1587-1588 1592 1596 








1610 1622 1627 1644 1658 1662 








1665-1666 1669 1675-1677 1749 


T" =5 3- 






1786 


salivary gland 


Clontech 


SAL001 


10 55 97 103 110 140 149 152 158 








198 217-218 242-243 256 301 308 








312 321 333 351 354 360 410 437 








448 473 487 494 496 501 535 555 








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925 








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181 








1234 1281-1282 1288-1289 1298 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449 








1474 1482 1492 1494 1498 1511 








1523-1524 1537 1554 1596 1626-








1627 1636 1652-1655 1658 1665 








1671-1672 1691-1692


salivary gland


Clontech


CRT em 


158 326 1423 1463-1464 


skin




C*T> rt rt ^ 

SFB001


1320 1400 












ATCC 


SFB002


262 736 1025 1253 










skin


ATCC 


SFB003 


709 1119 1350 1631 1653 


fibroblast






small 


Clontech 


SIN001 


25 142 146-147 151 155 198 203 


intestine 






244 260 271 280-281 286 288 298 








301-302 308 312 334 340 371 398 








408 412 414 416 423 426-427 430 








434-435 445 452 454 478 503 516 








519 521 523 543 547 549 555 559 








563 569-570 585 592 604 611 626 








628-629 632 650 659 681 710 714 








718 750 764 780 798 829 842 857 








859 866 887 892 894-895 901 904 








906-907 912 919 935 997-998 1000 








1007-1008 1026-1028 1044 1055 








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264 








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597 








1636 1638-1639 1645 1653 1656 








1662 1671 1675 1684 1691-1692 








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 


skeletal


Clontech 


SKM001 


18 20-21 82 84 101 118 134 148 


muscle 






151 153 166 225-226 258 274 277 








289 329 361 412 414 424 440 452 








459 470 488 503-504 537-540 647 








660 673-675 715 773 780 786 830 








905 922 950 963 982 990 992 1020 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal 


Clontech


SKM002 


168 1683 1712 


muscle 








skeletal 


Clontech 


SKMS03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMS04 


235-236 


muscle 








spinal cord 


Clontech 


SPC001 


4 9 11 17 30-31 35-36 43 46 60



82 85 92 94 108 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300-302 304 315
317 372 379 387 392 419 426-427
430 433 448 467 473 487 489 506
509 513 519 524 526 537-540 543
547 549 551 559 567 569-570 593
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681-
682 709 711 715 719 726-729 734
749-750 753 775-777 781 789 791
809 820 832 834-836 847-849 854-
855 858 861 864 871-872 875 884
898 906-908 917 919 924 934 942
944 970 985 990 992-993 998 1013
1039 1053 1059 1065 1072 1075
1077 1082 1085 1097 1103 1109
1116-1117 1128 1134 1151 1170
1174 1192-1194 1215 1225 1241
1243 1283 1294 1307 1312 1320
1323 1327 1330 1350 1353-1354
1356 1359 1368 1375 1400 1406-
1407 1423 1429 1437 1443 1448
1454 1470 1482 1492 1501 1508
1511 1529 1538 1548-1549 1565
1571 1578 1598 1600 1614 1625
1627 1630 1639 1646 1651-1652
1670 1686 1696 1740 1751 1755
1771



adult spleen 



Clontech



SPLc01



stomach 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



Clontech



STO001 



thalamus



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739 
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



Clontech



THA002 



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753 



Clontech 



44-45 54 57-58 62-64 79 104 123 

126 134 153 193 212-213 218 242- 

243 258 274 277 279 297 301 307 

327 330 333 342 351 358 371 410 

430 445 465-466 468 471 483 487

493 503 506 509 517 526 535 537- 



thymus 



THM001



540 546 548 554 567 584 586 590- 
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720 
72Q 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 1646 
1649 1654-1655 1661-1662 1669- 
1670 1674 1676-1677 1685-1688
1707 1711 1731-1732 1737 



thymus 



Clontech 



THMc02



5-9 15-21 25 33 35-36 43-45 48 
50-51 54-55 60 75 83 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 488 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549 
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786



thyroid gland



Clontech 



THR001



4 9-10 20-21 37-39 48 50-51 54- 
57 60-61 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155-158 163-164 168-169 171 
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 280-281 
284-286 288-289 298-299 302 309- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594- 
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823- 
824 826 828 833 838 841-845 847 
849 857-860 867 874-875 878 880- 
881 887-888 890-892 894-895 898
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978- 
979 981-982 987 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436-
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555- 
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 



trachea 



Clontech 



TRC001 



9 29-31 44 48 87 104 107 110 135
158 222 262 266 286 301 318 331



352 372 377 384 414 424 443-444
454 472 474 491 496 560 579 588 
593 597 607 612 626 681 702 719 
810 859 866 878 894-895 912 916 
922 932 935 1046 1075 1080 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



Clontech 



uterus 



UTR001 



17 19 25 41 46 57-58 61 89 104
108 139 152 174 198 200-201 206
263-265 274 290 387 408 420 438
446 448 452 473 491 493 499 503
506 513 519 522 526 530 542-543
560 601 610 632 659 665 720 751
773 780 833 845 857 872 877 912
929 934 937 996 1009-1011 1018
1050 1075 1107 1124 1170 1219
1258 1279 1287 1310 1320 1323
1343-1344 1375 1437 1451-1452
1478 1481 1498 1519 1521 1536
1552 1579 1597 1602 1606 1620
1626-1627 1649 1652 1661 1670
1719 1722-1723



TABLE 2



SEQ 

ID
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


% 

IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PRO1114 protein
sequence. 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


99 


3 


AF113136 


Homo sapiens 


IL-1 receptor-associated-kinase-M; IRAK-M


3043 


100 


4 


AF017806 


Mus musculus 


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor 


10535 


98 


6 


X02761 


Homo sapiens 


fibronectin precursor 


8990 


89


8


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


VI88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein.


2381 


100 


11 


AF117754


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


16 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1 


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein 
RIP60 


3130 


98 


18


AF064205 


Homo sapiens 


dynactin 1 p150 isoform


6377 


100 


19 


U00059 


Saccharomyces
cerevisiae


Yhr121wp


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


99 


24 


AJ289131


Homo sapiens 


chondroitin 4-O-sulfotransferase


2211 


99 


25 


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701


Homo sapiens 


ribosomal protein L23a 


791 


100 


28


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77 . 


1083 


99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2


1811 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase


1507 


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence. 


4726 


99


39


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino
acid sequence. 


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1) . 


795 


100 


42 


AF282626


Homo sapiens 


latexin 


1189 


100 


43 


G02150


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus


Elf-1


2724 


88 


45 


U19617 


Mus musculus


Elf-1 


2062 


86 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO: 24. 


1737 


99 


49 


X04145 


Homo sapiens


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding
protein 


1089


96 


53 


L31783 


Mus musculus


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


1491 


100 


57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript . 


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


6089 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4014 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15 . 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1 


1390 


100 


62 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp.


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29566 


Homo sapiens 


Homo sapiens DH1308_1 clone 
secreted protein. 


157 


30 


67 


AJ245738


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein


4906 


86 


70 


Z82059 


Caenorhabditis
elegans


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein 


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus


selectively expressed in 
embryonic epithelia protein-1


1485 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase


950 


100 


77 


X99302 


Homo sapiens 


Pop1


655 


100 


78 


AL136538 


Schizosaccharomyces
pombe


similarity to S. cerevisiae 
kti12 protein


210 


31 


79 


AF1297S* 


Homo sapiens 


G4 


1554 


99


80


AbU9b76o 


Homo sapiens 


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


2033 


100 


81




Homo sapiens


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


1220 


96 


82 


X57351 


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1


2700 


98 


84 




Homo sapiens


■Faa#* MirllD.P 

LaoL riyuir-^- 


5959 


99 


85




Homo sapiens 


ni cnioriae cnannei; p64Hl; 

l*Ull«4 


1305 


99 


86 




Mus musculus


SH2 domain-containing protein


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88


Ar ± y o jj z y 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana


contains similarity to pre- 
mRNA splicing 
factor~gene_id:MRB17.2


634 


36 


90




Mus musculus


homeodomain protein 


654 


57 


91 


AJ242864 


Mus musculus


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


VQ Q 1 C 

Y"" J o d 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO:86.


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO: 8. 


1031 


100 


95


HO1 1 T7/1 T 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10, 


1626 


100 


98 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005783 


Homo sapiens 


R33083_1


1974 


99 


100 


Y95293 


Homo sapiens 


Human GEP containing NEK-like
kinase substrate sGNK. 


4092 


99 


101 


M -|i QCA1 

ALllbbOl 


Homo sapiens 


dJ1191Nl6.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em: AL050069) ) 


1509 


100 






Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1


2042


96 


104




Homo sapiens


serine/threonine kinase 


4718 


100 




AclOlU f4 


Homo sapiens 


HSPC240 


831 


64 


106


M35522 


Canis
familiaris


GTP-binding protein (rab7) 


354 


50 


107


Ky y b uu 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform


1290 


93 


109 


AC005614 


Homo sapiens 


F23265_2 


3369 


99 


110


Ac Uo4 / ^jy 


Homo sapiens 


RAN binding protein 16 


3285 


100 


111


X52425 


Homo sapiens 


interleukin 4 receptor 


4496 


100 


112 


Y41686 


Homo 
sapiens 


sequence. 




i nn 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1. 


1991 


100 


114 


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2


1124 


90 


117 


W30891 


Homo sapiens


Human cytostatin III protein.


715


99


118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens




1 "7 A Q 


100 


120 


AF098070 


Drosophila
melanogaster


Lis1 homolog


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit


181 


37 


122 


Y70743 


Homo sapiens


trociVc J- protein encoueci oy 
NSEQ gene associated with 

matrix TPmnH 1 1 -j n rr 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens


Human viral receptor protein
(ACVRP)


833 


99 


125 


M63109 


Leishmania
major


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu 




i © 


127 


Z68220


Caenorhabdit 
is elegans 


Similarity to Human ADP/ATP
carrier protein


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens




463 


100 


130 


AF115391


Lactobacillus
sakei




508 


37


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 




Homo sapiens 


21-Glutamic Acid-Rich Protein


916 


87 


133 


W52811


Homo sapiens 


Human DBI/ACBP-like protein
(DBIH).


705 


97 


134


VO ft A A A 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


3230 


100 


135 


M69181 


Homo sapiens 


non-muscle myosin B 


189 


20 






Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 

HTTST7T.ft'3 
luiuriio J . 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 

ciiLuucu uy yene /d cionc 
HHGAU81. 


655 


99 


138 


AL033520 


Homo sapiens


uu . i \sxmxxar co 
KIAA0701 nrot-iMnl 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens


Human nmhoaea UTTt3M_Q 
ilLUUcail JLJJ. W Lea b fcJ tlUfn" 0 . 


QIC 


100 


142 


Z68493 


Caenorhabditis
elegans


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483


Homo sapiens


HSPC134


580 


51 


145


Y84902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea

purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 


F3F19.18 


647 


31 


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13 . 


1494 


98 


149 


AF056490


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


U10397 


Saccharomyces
cerevisiae


Yhr148wp


515


53 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382H0.5.i (novel protein
similar to ;i'ra'irrv/T - t*PNZl^


2034


99






154 


AF169802


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens


rab28


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6 . 


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32.


937 


100 


159 


Y17248 


Homo sapiens


Human protein kinase
inhibitor-2 (PKI-2).


383 


100 


160 


J04970 


Homo sapiens


carboxypeptidase M precursor


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster 
Androgen-dependent Expressed
Protein LIKE PUTATIVE 
piOLciiii \isoiorm u 


1357 


100 


163 


AF125535


Homo sapiens 


pp21 homolog 


193 


45 


164 


Ou J D J ^ 


Homo sapiens


Human secreted protein, SEQ 

Tn MO . 7T1 O 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649 


Zymomonas
mobilis


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRW clone 1944530 protein 
sequence . 


1204 


100 


168 




Homo sapiens 


Secreted protein encoded by 
gene ±xz. cione xiuivr l. / J. . 


1084 


100 


169 


AF214731




ATP-dependent RNA helicase


4402 


100 


170 


AE000871 


Methanobacterium
thermoautotrophicum


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens


noii c s\i\ 


2904


100 


173 


AJ245946


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949


Homo sapiens


This gene is novel . 


3202 


100 


175 


Y07923 




GTP-binding protein 


1205 


100 


176


»• _Z V J J a 




Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens


Human channel-related
molecule HCRM-3.


1x22 


100 


178 


Y41674


Homo sapiens


Human channel-related
molecule HCRM-9


___ 

"jo 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 




1 0 A ft 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89


183 


U57344


Mus musculus 


Meis3 


i r 1 J 


Q c 
oo 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 

\J I. UUClU 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605


82 


187


W75058 


Homo sapiens


Human secreted protein
encoded by gene 2 clone 
HLDBG33. 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase 


3705 


100


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


2605 


99


194


AF084254 


Mus musculus


bromodomain-containing
protein BP75 


693 


54 


195 


Y00752 


Rattus 
norvegicus


serine dehydratase (AA 1 - 
S£ 1 ) 


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
procem uu / u / . 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_l. 


1614 


100 


199 


V / A 1 11 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142.


558 


99 


203 


X13885 


Nicotiana
tabacum


extensin (AA 1-620) 

- 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ107^E17.1 (KIAA0823 protein
(continues in AL023803)) 


£94 


54 


210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus 


Clq C chain 


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein)


259 


100 


218 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase


3331 


99 


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL031786 


Schizosaccharomyces
pombe


putative proline-trna 
synthetase 


811 


41 


222 


AL109736 


Schizosaccharomyces
pombe


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 




AL035659


Homo sapiens 


dJ979N1.1 (dJ979N1.1)


5199 


98 


225 


AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus 


mmDj4 


1988 


92 


227 


X83502 


Saccharomyces
cerevisiae


J1007 


112 


2£ 




Ad jdUc 


Saccharomyces
cerevisiae


J1007 




25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


W00365 


Homo sapiens 


Human cyclin B1.


2218 


99 


234 


Y53762


Homo sapiens 


A GTP-binding polypeptide
designated RAQ.


1017


100






235


6311 /4y 


Homo sapiens 


yeast sds22 homolog 


1800 


100 


236 


Z50749 


Homo sapiens 


yeast sds22 homolog 


1754 


98 


237 


AB026491 


Homo sapiens 


PICK1 


2137 


100 


238 


AJ270205 


Entodinium
caudatum


putative 

phosphatidylinositol-4-
phosphate 5-kinase


114 


37 


239


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710


93


240


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen


996 


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen


1005 


100 


244 


AL031320 


Homo sapiens 


dJ20N2.1 (novel protein
similar to yeast and 
bacterial cytosine 
deaminase) 


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


2391 


98 


247


1732274 


Saccharomyces
cerevisiae


Ydr386wp; CAI: 0.12


191 


37 


248


Y41719 


Homo
sapiens


Human PRO564 protein
sequence.


1879 


100 


249 


AB029434 


Homo sapiens 


ghrelin precursor 


611 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acylcarnitine 
carrier protein 


246 


38 


251 


W80993 


Homo 
sapiens 


Human RIP-interacting factor
RIF. 


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 


253 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM49) . 


765 


100 


254 


AL354533 


Leishmania 
major 


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens 


Human cytokine signal
regulator CKSR-1 SEQ ID
NO:1.


2247


99 


257 


AL035539 


Arabidopsis 
thaliana


putative amino acid transport 
protein 


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61.


1171 


100 




AL03b689 


Homo sapiens 


dJ187J11.1 (novel protein
similar to protein kinase C 
inhibitors) 


974 


100 






Methanobacterium
thermoautotrophicum


serine/threonine protein
kinase related protein 


363 


30 


261 


AL050131 


Homo sapiens 


hypothetical protein 


626 


100 


262 


AF019661 


Mus musculus 


<jGL.ci jpiULcdoOUie CiiclJ.il, ril'lrtJ 


1214 


100 


263 


AL035593 


Homo sapiens 


dJ310J6.1 (novel protein) 


821 


100


264


AL022318


Homo sapiens 


bK150C2.3 (PUTATIVE novel 
protein similar to APOBEC1) 


1072 


100


265 


AF205940 


Homo sapiens 


endomucin 


1289 


100 


266 


AL023583 


Homo sapiens 


dJ500L14.1 (novel protein) 


789 


100 


267 


AL034548 


Homo sapiens 


dJ1103G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein CBFW) 


1888 


99


268 


AF161470 


Homo sapiens 


HSPC121 


1884 


98 


269 


AF161470 






1232 


96 


270 


X90763 


Homo sapiens


HHa5 hair keratin type I 
intermediate filament


2190 


99 


271 


AF207600 


Homo sapiens 


ethanolamine kinase 


1952 


100 


272


M32334 


Homo sapiens 


intercellular adhesion 
molecule 2 


1436 


100 


273 


AF161483 


Homo sapiens 


HSPC134 


663 


61 


274 


Y53052 


Homo sapiens 


Human secreted protein clone 
df202_3 protein sequence SEQ 
ID NO: 110. 


587 


100 


276 


Y77576 


Homo sapiens 


Human cytoskeletal protein 
(HCYT) (clone 2195418).


762 


100 


277 


AF077042


Homo sapiens 


30S ribosomal protein S7
homolog


1269 


100 


278 


Y94907 


Homo sapiens 


Human secreted protein clone 
cal06*_19x protein sequence 
SEQ ID NO: 20.


1619 


98 


279


xbo /ob 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99


280 


Z75134 


Canis
familiaris


rod transducin 


1816 


100 


281 


Z75134


Canis
familiaris


rod transducin 



1718 


96 


282 


■rVr ctjO f J 


Homo sapiens


muscle-specific protein 


1395 


100 


283 


AL050007


Homo sapiens 


hypothetical protein 


405 


98 


284 


/ir ^ u J. jj J. 


Homo sapiens 


DC1 


1859 


99 


285 




Homo sapiens 


ELL complex EAP30 subunit 


1318 


99 


286 


V-a soon 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 


1250 


99 


287 


U88964 


Homo sapiens 


HEM45 


923 


100 


288 




Homo sapiens


hypothetical protein 


598 


100 


289 


AiTOI 1 OQR 
nu v xxu j u 


Homo sapiens 


telethonin 


574 


100 


290 


Y66724 


Homo 
sapiens 


Membrane-bound protein

rHUo J b . 


2321 


100 


291 


AF034801 


Homo sapiens 


liprin-alpha4


2565 


98 




AF034801


Homo sapiens 


liprin-alpha4 


2590 


100 






Homo sapiens 


dJ889J22B.1 (novel protein
(isoform 1))


1738 


100 




Y73348 


Homo sapiens 


HTRM clone 839651 protein
sequence.


1245 


99 




T.I 1 C70 


Homo sapiens 


zinc finger protein 


1694 


44 


296 


AL035423 


Homo sapiens 



dJ20I3.1 (brain mitochondrial 
carrier protein-1 (BMCP1))


1024 


79 


297 




Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


298 




Homo sapiens 


HSPC299


1147 


85 


299 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1


1236 


99 


300 


U26397 


Rattus
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


301 


nrUJ 


Homo sapiens 


meningioma-expressed antigen
5 


3458 


100 




Z82022 


Homo sapiens 


GlcNAc-1-P transferase


2067 


99 


303


AF269232


"WO lUUcLUllib 


butyrophilin-like protein
BUTR-1 


271 


50 


304 


AJ222S44 


Arabidopsis 
thaliana 


asparaginyl-tRNA synthetase 


659 


50 


305 


AF054180


Homo 
sapiens 


Hematopoietic cell derived 
zinc finger protein 


351 


79 


306 


AJ272079


Homo sapiens 


APOBEC-1 stimulating protein 


3056 


100 


308 


Y44486 


Homo 
sapiens 


Human GPRW receptor 
polypeptide. 


1721 


100 


309 


AJ131891 


Homo sapiens 


DNA polymerase mu 


2598 


100


TABLE 2



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


% 

IDENTITY 


310 


AF293335 


Homo sapiens 


p30 DBC


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z36715 


Homo sapiens 


Net 


2048


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 


Y66666 


Homo 
sapiens 


Membrane-bound protein
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1. 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein 


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013


Homo sapiens 


Human secreted protein 
vc48_1, SEQ ID NO: 66.


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271.


1976 


100 


325 


Y94944 


Homo sapiens 


Human secreted protein clone 
bfl57_JL6 protein sequence 
SEQ ID NO: 94. 


2305 


98 


326


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


328 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo sapiens]
>R65207 R65207 02-MAR-1995
27-AUG-1993 Human stromalin-1.
[Homo sapiens


nuclear protein SA-1 


6492 


99 


331 


AL008583 


Homo sapiens 


dJ327J16.3 (supported by 
GENSCAN, FGENES and GENEWISE)


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous the
Eimeria tenella gene et100


154 


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


337 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


Z665(Sl 


Caenorhabditis
elegans


Similarity to Human rab13
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 


439


84








VDJ region 








U10281 


Sus scrofa 


gastric mucin 


279 


24 


345 


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


84 


347 


L22557 


Rattus 
norvegicus


calmodulin-binding protein


2363 


91


348


AT f\A QA Ol 


Arabidopsis 
thaliana


AIG1-like protein


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


99 




AK024477 


Homo sapiens 


FLJ00070 protein 


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302


2623 


97 


355 


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356 


AF151029 


Homo sapiens 


HSPC195 


941 


91 


357 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


1911 


100 


358 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


359 


X03414 


Drosophila 
melanogaster 


Kr polypeptide 


316 


45 


360


AF151079 


Homo sapiens 


HSPC245


643 


100 


361 


Y53886 


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


530 


41 


362


AF254741


Drosophila 
melanogaster 


Centaurin Gamma 1A 


681 


46 


363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364


AF181562 


Homo sapiens 


proSAAS 


1319 


100 


365 


AF181562 


Homo sapiens 


proSAAS


1024 


99 


366 


U73200 


Mus musculus 


p116Rip


804 


82 


367


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


99 


368 


U37501 


Mus musculus 


laminin alpha 5 chain 


5867 


72 


369 


AF043695 


Caenorhabditis
elegans


similar to the protein 
phosphatase 2c family


549 


36 


370 


Y73440 


Homo sapiens 


Human secreted protein clone 
yj23_1 protein sequence SEQ
ID NO:102.


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927 


100 


373 


Y73345 


Homo sapiens 


HTRM clone 438283 protein 
sequence.


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


2717 


98 


375 


A9510€ 


unidentified 


RED ALPHA 


1202 


99 


376 


W74828 


Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLQAB52.


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379 


AF090934 


Homo sapiens 


PRO0518 


382


100 


380 


X66363 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381 


Y41699 


Homo 
sapiens 


Human PRO703 protein 
sequence.


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk173c12.5


246 1 


36 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238520 


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97


387 


AF208845 


Homo sapiens 


BM-003 


1375 


99 


389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


l£l£ 


62 


395 


AF181721 


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human betalV-spectrin 
protein. 


1626 


98 


397 


U48238 


Mus musculus 


zinc finger protein neuro-d4 


749 


60 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces
pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase similar to 
Q02218 (PID:g1352618)


4176 


78


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62 


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to 
D. melanogaster CG5986 
protein) 


761 


100 


404 


Z68753 


Caenorhabditis
elegans


ZC518.3b 


888 


48 


405 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


406


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106


Homo sapiens 


NY-REN-36 antigen 


1168 


100 


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin 


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 


AL031658 


Homo sapiens 


dJ310O13.7 (novel protein 
similar to H. roretzi HRPET- 
3) 


776 


98 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit 


2961 


99 


416 


U43503 


Saccharomyces
cerevisiae


Lphlp 


115 


42 


417 


AL160493 


Leishmania 
major 


possible t26fl7.21 


239 


35 


418 


Y08100 


Homo sapiens 


Human PRO331 protein.


330 


29 


419 


U15131 


Homo sapiens 


p126


2228 


54 


420 


AF117946 


Homo sapiens


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530 


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a 


7269 


100 


425


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor 


1084 


55


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 


AE003683 


Drosophila 
melanogaster 


CG8312 gene product 


149 


29 


429 


Y07829 


Homo sapiens 


RING finger protein 


2201 


99


430 


AF096897 


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division 
control protein 


2284 


100 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like


686 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100 


438 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase


1075 


63 


439 


AF105228 


Bos taurus 


tuftelin 


285 


33 


440 


R064S3 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553). 


3073 


99 


441 


X14971 


Mus musculus 


alpha-adaptin (A) (AA 1-977) 


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1- 
938) 


3979 


81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein 
PRO1136.


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana 


unknown protein; 20348-23707


114 


33 


445 


AF229032 


Mus musculus 


piL 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin 


2662 


85


447 


AF132484 


Mus musculus 


unknown 


478


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327


1606 


100 


450 


Z68753 


Caenorhabditis
elegans


ZC518.3b 


951 


49 


451 


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3.


155 


32 


452 


W85727 


Homo 
sapiens 


Novel protein (Clone 
BM46JL0) - 


2799 


99 


453 


Y^3£29 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005 


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61 1 protein sequence SEQ 
ID NO: 156. 


966 


100 


459 


W£7824 


Homo sapiens 


Human secreted protein 
encoded by gene 18 clone
HSLFM29.


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8125.


486 


93 


463 


AC002398 


Homo sapiens 


F25965_1


1018 


100 


464 


AF0*4856 


Rattus sp. 


7acomp protein 


1845 


84 


465 


AF223408 


Homo sapiens 


B99 


3686 


99


466


AF223408 


Homo sapiens 


B99 


2878 


87 


467 


AF104415 


Mus musculus


gene trap locus-13


6336 


91 


468 


U53450 


Rattus 
norvegicus 


Jun dimerization protein 1 
JDP-1


196 


49 


469 


AL031297 


Homo sapiens 


dJ97P20.1 (novel gene) 


3564 


99 


470 


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2B 
subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein 


284 


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX39. 


838 


100 


475 


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52. 


3411 


100 


476 


D38549


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 


AL031534 


Schizosaccharomyces
pombe


putative asparagine synthase 


482 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein 


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77 


481 


AJ238248


Homo sapiens 


centaurin beta2 


3986 


99 


482 


Z38061 


Saccharomyces
cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 


HSPC263 


1404 


100 


484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527 


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487


Y19062 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


555 


56 


489 


AL021918 


Homo 
sapiens 


b34I8.1 (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


490 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-
938)


4675 


97 


491 


U52426 


Homo sapiens 


GOK 


1459 


59 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportinl 


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein 
with some similarity to 
Drosophila KRAKEN) 


513 


96 


495 


AF036977 


Homo sapiens 


unknown 


1812 


100


496 


U93564 


Homo sapiens 


p40 


133 


45


497 


Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126. 


357 


100 


498


AF069781 


Drosophila 
melanogaster 


Bem46-like protein 


653 


43 


499 


Y16601 


Homo sapiens 


Human cell-cycle
phosphoprotein CECYP-2. 


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 


AF027503 


Mus musculus


putative membrane-associated
guanylate kinase 1 


205 


36 


502 


AF282874 


Homo sapiens 


nectin 3; PRR3 1 


2856 


99 


503 


AJ249732 


Homo sapiens 


G8 protein 


669 


100 


504 


AF208861 


Homo sapiens 


BM-019 


1629 


100 


505 


L09708 


Homo sapiens 


complement component C2 


4022 


100 


507 


X66285 


Mus musculus 


HC1 ORF 


115 


43 


508 


D00189 


Rattus 
norvegicus 


Na+ , K+-ATPase alpha-subunit 


5227 


99


509


Y94971 


Homo sapiens 


Human secreted protein clone 
fal71_l protein sequence SEQ 
ID NO: 148.


2176 


100 


510 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase


781 


77 


511 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase 


1347 


100 


512 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase


1520 


99 


513 


X84908 


Homo sapiens 


phosphorylase kinase 


5729 


99 


514 


X52851 


Homo sapiens 


peptidylprolyl isomerase 


650 


76 


515 


AF186084


Homo 
sapiens 


epidermal growth factor 
repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7683. 


505 


99 


517 


U04706 


Bos taurus


50 kDa protein 


1749 


77 


518 


G00653 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734. 


530 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y9936* 


Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88.


3394 


97 


521 


AF266852 


Homo sapiens 


PTPLA 


1295 


100 


522 


AE000995 


Archaeoglobus
fulgidus


chromosome segregation 
protein (smcl) 


153 


20 


523 


AF062249 


Homo sapiens 


immunoglobulin heavy chain 
variable region 


605 


97 


524 


AJ223830 


Rattus 
norvegicus 


ARE1 


2950 


98 


525 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


1276 


83 


526 


AF145658 


Drosophila 
melanogaster 


BcDNA.GH10229


320 


33 


527 


AF112213 


Homo sapiens 


putative Rab5-interacting
protein 


524 


79 


528


D49387 


Homo 
sapiens 


NADP dependent leukotriene b4 
12-hydroxydehydrogenase


1*1* 


100 


529 


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9.


328 


32 


530 


AL079335 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein 
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induce MG11.)


1059 


99 


531 


Y91506 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179. 


1159 


98 


532 


X76116 


Caenorhabditis
elegans


carrier protein (c2) 


576 


50 


533 


X76116 


Caenorhabditis
elegans


carrier protein (c2)


506 


50 


534 


X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase 
propeptide (424 AA) 


1972 


100 


535 


Y09267 


Homo sapiens 


flavin-containing
monooxygenase 2 


246* 


100 


536 


Z11773 


Homo sapiens 


SRE-ZBP 


2201 


99 


537 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4741 


99 


538 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


3887 


99 


539 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


2933 


96 


540 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 


541 


J03244 


Bos taurus 


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848 


77 


542 


Y92514 


Homo sapiens 


Human OXRE-11. 


2301


99 


543 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2151 


61 


544 


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 


207 


38 


545 


A06669 


synthetic 
construct 


preTGF-betal 


2070 


99


546


Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60.


854 


98 


547 


AF112205 


Homo sapiens 


WSB-1 protein 


2275 


100 


548 


X60271 


Mus musculus 


c-rel 


2264 


74 


549 


AC016827 


Arabidopsis 
thaliana 


putative GTPase 


810 


42


550


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1 


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4.


1112 


95 


553


AF119855 


Homo sapiens 


PRO1847


265 


67 


554 


M17236 


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 


555 


AL078468 


Arabidopsis 
thaliana 


putative protein 


540 


40 


556 


AC006963 


Homo sapiens 


similar to Kelch proteins;
similar to BAA77027
(PID:g4650844)


515 


44 


557 


AK024487


Homo sapiens 


FLJ00086 protein 


1623 


98 


558 


M12140 


Homo sapiens 


pol gene protein; Xxx 


117 


48 


559 


W74825 


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAQBF73.


225 


56 


560 


X56681 


Homo sapiens 


junD protein 


373 


88 


561 


AF003136 


Caenorhabditis
elegans


contains weak similarity to 
an AMP-binding motif 


2926 


54 


562 


AL109839 


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1
(poly(A)-binding protein)


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817


289 


42 


564 


AF052723 


Feline leukemia
virus


gag-pol precursor polyprotein 
gPr80 


1547 


43 


565 


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28617 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 


93 


570


AF155113 


Homo sapiens 


NY-REN-55 antigen 


3951 


99 


571 


AL032821 


Homo sapiens 


dJ55C23.1 (vanin 1) 


1821 


98 


572 


M69181 


Homo sapiens 


non-muscle myosin B 


7350 


99 


573 


M69181 


Homo sapiens 


non-muscle myosin B 


7311 


98 


574 


Y59678 


Homo sapiens 


Secreted protein 108-008-5-0-
E6-FL.


772 


100 


575 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


576 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 


100 


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 


2339 


99 


582 


U82319 


Homo sapiens 


novel ORF 


342 


100 


583 


P92219 


Homo sapiens 
(human) 


CR1 protein. 


11425 


99 


584


AJ223948


Homo sapiens 


RNA helicase 


6608 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


3874 


99 


586


Y42384 


Homo 
sapiens 


Amino acid sequence of 
Iv3l0 7. 


1007 


37 


587 


AF129756 


Homo sapiens 


BAT4 


1873 


98


588


AF131775 


Homo sapiens 


Unknown 


1929 


99 


589 


AJ250865 


Homo sapiens 


TESS 2 


2348 


100 


591 


Z98885


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802 


Homo sapiens 


dJ798A10.1 (novel protein)


212 


55 


596 


AL022329 


Homo 
sapiens 


DK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


Homo sapiens]
>Y49635 Y49635 21-OCT-1999
15-APR-1998 Human sdp3.5
protein. [Homo sapiens


putative cell cycle control 
protein 


335 


23 


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 18. 


1574 


99 


600 


L36531 


Homo sapiens 


integrin alpha 8 subunit 


5386 


99 


601 


Y38458


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584 


Homo sapiens 


GGA1 


3265 


100 


603 


Y13115 


Homo sapiens 


serine/threonine protein 
kinase 


5071 


99 


604 


AL132776 


Homo sapiens 


dJ393D12.1 (KIAA0776) 


2413 


99 


605 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L 


2603 


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


30^9 


100 


610 


AF163572 


Homo sapiens 


Forssman glycolipid
synthetase 


1865 


99 


611


AF161503 


Homo sapiens 


HSPC154 


1261 


97 


612 


L41834 


Ensis minor 


nuclear protein 


345 


30 


613 


Y91954 


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9).


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027) 


361 


94 


615 


X85766 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2 


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


3609 


97 


618 


U28789 


Mus musculus 


PACT 


5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
163. 


1684 


99 


620 


AB046382 


Mus musculus 


testis-abundant finger 
protein 


199 


23 


621 


YO0OS2 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


99 


622 


AF068286 


Homo sapiens 


HDCMD38P 


861 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582


93 


627


X14968 


Homo sapiens 


RH-alpha subunit (AA 1-404) 


2079


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_1 derived protein


1983 


100


629


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_1 derived protein


1694 


100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII 


1754 


100 


631 


AL034555 


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67))


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94.


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


2236 


100


634 


AF041429 


Homo sapiens 


pRGR1


823 


99 


635 


X66357 


Homo sapiens 


serine/threonine protein 
kinase 


1589 


100 


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637 


AB004884 


Homo sapiens 


PKU-alpha


3718 


99 


638 


AJ002303 


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b


1002 


100 


640 


AJ002303 


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D87682 


Homo sapiens 


similar to a C. elegans
protein encoded in cosmid
T26A5.


2676 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900 


Homo sapiens 


PR0282 2 


185


76


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein 


10110 


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11 
homolog


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 


3830 


91 


650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388 


99 


654 


AC004614 


Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026


99 


655 


Y57908 


Homo sapiens 


Human transmembrane protein 
HTMPN-32. 


608 


99 


656 


Z34975


Homo sapiens 


ldlCp 


3733 


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein) 


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus 


dysferlin 


4752 


59 


663 


AF182316 


Homo sapiens 


myoferlin 


6232 


99 


665 


AL161516 


Arabidopsis 
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99


668 


Y1335S 


Homo sapiens 


Amino acid sequence of 
protein PRO220. 


3692 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo-
beta-N-acetylglucosaminidase
gene


611 


52 


671 


X56123 


Mus musculus 


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675 


L14463 


Rattus
norvegicus


transducin


3619


92


676


AC005757 


Homo sapiens 


R32<Sll 1 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral 
element} 


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1 


1783 


100 


680 


AF118566 


Mus musculus 


hematopoietic zinc finger 
protein 


769 


50 


681 


Y51415 


Homo 
sapiens 


Human wild type pKe83 
protein. 


2^21 


99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


68 


683 


YS6214 


Homo sapiens 


Nuclear transport protein 
clone hfb34l protein 
sequence. 


5888 


99 


684 


Y94952 


Homo sapiens 


Human secreted protein clone 
fhll6_ll protein sequence 
SEQ ID NO: 110. 


354 


98 


685 


AL021878


Homo sapiens 


dJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2))


154 


67 


686 


AE000198 


Escherichia 
coli 


orf, hypothetical protein 


628 


100 


687 


M58378 


Homo sapiens 


synapsin I


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


U09355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B 
gamma subunit 


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5 


1542 


100 


692 


X90530 


Homo sapiens 


ragB 


1926 


99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644. 


330 


100 


696 


AC011810 


Arabidopsis 
thaliana 


Putative methionine 
aminopeptidase 


669 


52 


697 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699


Y99461 


Homo sapiens 


Human PRO1327 (UNQ^87) amino
acid sequence SEQ ID NO:218. 


1386 


100 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchml.


1736 


99 


706 


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK1191B2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1))


2030 


100 


708 


AJ00626£ 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08698 


Homo sapiens 


ranbp3


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a
human phosphorylation 
effector PHSP-2. 


754


99 


712 


U93574 


Homo sapiens 


putative p150


799 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2.


734 


98 


716 


AL137013 


Homo sapiens 


DA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 


718 


Y96290 


Homo sapiens]
>P40254
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2345 


85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565 


Homo
sapiens]
>W41564
W41564 08-
OCT-1997 05-
APR-1996
Human
calpain.
[Homo
sapiens


Human calpain. 


1591 


99 


723 


AFl£l341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988 


46 


727 


AC024818 


Caenorhabditis
elegans


contains similarity to Pfam
family PF00400 (WD domain,
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728 


AJ005897 


Homo sapiens 


JM5 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus 
masou


GTP-binding protein


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8. 


8£2 


97 


733 


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813 


Caenorhabditis
elegans


Hypothetical protein 
Y54F10AL.a 


152


24 


735 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol 
phosphatidyl transferase 
family member protein) 


1562 


98 


736 


U00033 


Caenorhabditis
elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793


100 


739 


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


953 


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


U97191 


Caenorhabditis
elegans


strong similarity to the YPT1
sub-family of RAS proteins


960 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase 


2191 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7290. 


496 


98 


745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


99 


746


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y733B8 


Homo sapiens 


HTRM clone 3376404 protein 
sequence.


1565 


99 


748 


M19529 


Sus scrofa


follistatin A 


1906 


98


749 


AJ249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


28 


750 


AC004410 


Homo sapiens 


fos39554_l 


2094 


100 


751 


AF074968 


Homo sapiens 


P47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Sp1


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine
phosphohistidine inorganic
pyrophosphate phosphatase


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39


160 


77 


755 


AB008430


Homo sapiens 


CDEP 


142 


29 


758 


L32162 


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A 


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


568 


38 


766 


AL023828 


Caenorhabditis
elegans


Y17G7B.14 


200 


27 


767 


Y82777 


Homo sapiens 


Human chordin related protein 
(Clone dw665_4).


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3). 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521)


2641 


97 


771 


AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap2 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5).


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961. 


849 


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


dJ130E4.2 (KIAA0796)


1321 


68 


781 


Z75955 


Caenorhabditis
elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain- 
containing 1 protein) 


900 


100 


783 


AF061262 


Mus
musculus


semaF cytoplasmic domain 
associated protein 2 


1316 


63 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95






785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786


Y00918 


Homo sapiens 


Human Ran protein, RABP-1, 
protein sequence. 


1048 


99 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


94 


789 


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710 


Rattus
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638 


bacteriophage
lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens


T?f*t* inohl a e t"Oina hindincr 
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabditis
elegans


strong similarity to the

P1"t/P14 fami 1 v nf kina«?f»K 
r j / ir j. *± LCaiuxiy rvAiiaDco 


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800




D a f Hue 
KatCUS 


serine** arginine-rich splicing 






801 


AF267852 


Homo sapiens 


placental protein 13-like


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 




Caenorhabditis
elegans


Similarity to Human 

t e t J- noD J. a s l. uriici — d j-xju. x ny 

comes from this gene 




z i 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194.


496 


98 


805 


AL121S73 


Homo sapiens 


DA305P22.1 (novel protein) 


1160 


100 


806 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


808 


AB013885 


Homo sapiens 


beta-ureidopropionase 


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303 


2134 


96 


811 




Homo sapiens


mnl -i rm pra c o «inai 1 nn ril "7 

uvin puiyiucxctsc c^oi^un kjA / 

subunit 




100 


812 


Z74029 


Caenorhabditis
elegans


Similarity to C.elegans 

ax<»uiiuji uciiyuiuy cuasc tunica 

from this gene 


610 


71 


813 


Z73497 


Homo sapiens


CU240C2 2 (Core histone
H2A/H2B/H3/H4) 


324 


100 


814 


W87689 


Homo 
sapiens 


Human HTXFT1 9 polypeptide. 


1484 


99 


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccharomyces
pombe


caffeine-induced death
protein 1


184


29






825 


AJ006692 


Homo sapiens 


ultra high sulfur keratin


693 


66 


826 


U23037 


Oryctolagus 
cuniculus


eIF-2Bepsilon


3406 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828 


Y30827 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829 


Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33.


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9 


2097 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720. 


223 


70 


834 


AF119664 


Homo sapiens


transcriptional regulator 
protein HCNGP 


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 


AF119664


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1448 


94 


837 


X12517 


Homo sapiens 


C protein (AA 1-159) 


918 


100 


838 


U32865 


Drosophila 
melanogaster 


linotte protein 


164 


24 


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2 


631 


56


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase 


2840 


98 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase 


1796 


100 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390. 


278 


98


843 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


629 


100 


845 


U27838 


Mus musculus 


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


3305 


96 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


AF164794 


Homo sapiens 


Diff33 protein homolog 


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784 


Homo sapiens 


makorin 1 


2062 


97 


851 


Y58628 


Homo sapiens 


Protein regulating gene 
expression PRGE-21. 


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100 


854 


G03342 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


330 


96 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


203 


100 


856 


AF285118


Homo sapiens 


CGI-203 


452 


100 


857 


AC006069 


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


1383 


55 


858


AL021546 


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1


616 


100 


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabditis
elegans


mitochondrial carrier protein 


370 


43 


864 


AF154108 


Homo sapiens


tumor necrosis factor type 1
receptor associated protein


3559


99






865 


AE001530 


Helicobacter 
pylori J99 


putative 


230 


32 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light 
chain 


699 


91 




867


AL031673


Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 
type Zinc finger domains) 


4066 


99 


868 


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100 


869 


AF192968 


Homo sapiens 


high-glucose- regulated 
protein 8 


3041 


99 


870 


AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


99 


871 


AL031427 


Homo sapiens


dJ167A19.1 (novel protein) 


1608 


100 


872 


AF151534 


Homo sapiens


core histone macroH2A2.2 


1866 


100 


873 


AL021331 


Homo sapiens


dJ366N23.1 (putative C. 
elegans UNC-93 (protein 1,
C46F11.1) LIKE protein) 


1129 


100 


874 


X14608 


Homo sapiens 


propionyl-CoA carboxylase 


3579 


100 


875 


AL117334 


Homo sapiens 


dJ687F11.1 (novel protein
(part of translation of cDNA
DKFZp434N061, Em:AL110249))


30* 


100 


876 


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446 


35 


877 


Y53001 


Homo sapiens 


Human secreted protein clone 
dn834_l protein sequence SEQ 
ID NO: 8. 


811 


100 


878 


AF281064 


Homo sapiens 


CHMP1.5


957 


100


879 


X79417 


Sus scrofa


40S ribosomal protein S12 


687 


100 


880 


AF001317 


Saccharomyces
cerevisiae


Soilp 


478 


28 


881 


Y87275 


Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52. 


2547 


100 


882 


M14036 


Homo sapiens 


C1-inhibitor


598 


77 


883 


AB041261 


Homo sapiens 


calcium-independent
phospholipase A2


2903 


100 


884


AF020313 


Mus musculus 


proline-rich protein 48


999 


84 


885 


Y10936 


Homo sapiens 


hypothetical protein 


1104 


99 


886 


AF073997 


Mus musculus 


myotubularin related protein 
1 


866 


36 


887 


Y57893 


Homo sapiens 


Human transmembrane protein
HTMPN-17.


1099 


94 


888 


AL11763S 


Homo sapiens 


hypothetical protein 


929 


99 


889 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT9


2046 


99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 


AF237631 


Homo sapiens 


ubiquitous tropomodulin U- 
Tmod 


1798 


100 


893 


AF090929 


Homo sapiens 


PRO0477p


653 


99 


894 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


3196 


100 


895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


2825 


96 


896 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


897 


AE003551 


Drosophila 
melanogaster 


CG18176 gene product


633 


33 


898 


AJ237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899


Z97184 


Homo sapiens 


HKE2 


624 


100 


900 


Z97184 


Homo sapiens 


HKE2 


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W84085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243. 


387 


87 


913 


AJ243721 


Homo 
sapiens]
>Y92508
Y92508 13-
APR-2000 06-
OCT-1998
Human OXRE-
5. [Homo
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1710 


100 


914 


U24189 


Caenorhabditis
elegans


hypothetical protein 1207-1; 
Method: conceptual
translation supplied by
authors 


244 



41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916 


AE000984 


Archaeoglobus
fulgidus


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


918 


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein


163 


30 


919 


L12018 


Caenorhabditis
elegans


putative 


1232 


41 


920 


AF102177 


Homo sapiens 


tumor antigen SLP-8p 


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


86£ 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis
elegans


similar to
Schizosaccharomyces pombe


605 


51 


925 


X71978 


Mus musculus


Fif 


1503 


95 


926 


M92288 


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9.


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703__l. 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase


912 


100 


931 


U28991 


Caenorhabditis
elegans


coded for by C. elegans cDNA
cm21c7


660


5*






932 


AL080065 


Homo sapiens 


hypothetical protein 


210 


25 


933 


G01884 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965. 


767 


98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100 


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 


1142 


80 


936 


AB026808 


Mus musculus 


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


2601 


99 


938 


X65724 


Homo sapiens 


0RF2 


498 


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection 
related protein 


452 


100 


942 


AC024200 


Caenorhabdit 
is elegans 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 


350 


69 


943 


AF129756 


Homo sapiens 


G5c 


273 


100 


944 


M23765 


Rattus 
norvegicus 


alpha-tropomyosin


133 


96 


945 


AC009917 


Arabidopsis 
thaliana 


Contains similarity to 


583


47


946 


AF22346S 


Homo sapiens 


AD021 protein 


551 


44 


947 


AF055473 


Homo sapiens 


GAGE-8


273 


51 


948


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68 


949 


AF143956


Mus musculus 


coronin-2 


2300 


93 


950 


Y36729 


Homo 
sapiens 


Human PG1 protein sequence. 


1861 


99 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


282 


67 


952 


AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7~ 


203 


46 


953 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL- 
1999 12-AUG-1998 Human NCE-2 
protein. 


3^5 


100 


954 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


823 


45 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF131 


2483 


99 


956 


U09410 


Homo sapiens 


zinc finger protein ZNF131 


1853 


99 


957 


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha 


2126 


99 


958 


X94917 


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb 


155 


32 


959 


U54807 


Rattus 
norvegicus 


GTP-binding protein


1167 


97 


960 


AF058807 


Bos taurus 


GTP-binding protein rah


606 


97 


961 


G03244


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7325.


471 


100 


962 


AF078850 


Homo sapiens 


steroid dehydrogenase homolog 


583 


40 


963 


AP001754 


Homo sapiens 


transient receptor potential-
related channel 7, a novel
putative Ca2+ channel protein 


317 


30 


964 


AL035419 


Homo sapiens 


dJ1100H13.1 (putative novel 
protein) 


1129 


100 


965 


X61381 


Rattus
rattus 


interferon-induced protein


202 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ465N24.2.1 (PUTATIVE novel 
protein) (isoform 1) 


893 


100 


968 


U79275 


Homo sapiens 


unknown 


611 


100 


969 


AJ011306 


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform) 


2752 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46 


1186 


100 


971 


U53336 


Caenorhabdit 
is elegans 


weak similarity over a short 
region to myosin heavy chain 


536 


23


972 


AC018749 


Leishmania 
major 


L8840.12 


589 


53 


973 


AP188504 


Mus musculus 


LNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens 


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


976 


AF161530 


Homo sapiens 


HSPC182 


1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


£26 


100 


978


AF164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor XLMO1


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrine 


2029 


100 


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus 
sp. AD45


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 


AB030835


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4^97 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1 


1262 


38 


988 


AL022238


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE)


4048 


99 


989 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


991 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


992 


AF161426 


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859 


Schizosaccharomyces
pombe


tRNA-splicing endonuclease
subunit 


172 


42 


994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253 


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A


974 


100 


997 


AJ248285 


Pyrococcus 
abyssi 


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641 


Drosophila 
melanogaster 


BG:DS00941.3 gene product 


218 


58 


999 


W69343 


Homo 
sapiens 


Secreted protein of clone 
CR930_1. 


1340 


98 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100 


1005 


S45367 


Canis
familiaris


centractin 


1949 


100 


1006


S45367 


Canis
familiaris


centractin 


1315 


98 


1007 


AB022158 


Mus
musculus


chaperonin containing TCP-1 
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabditis
elegans


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BCDNA.GH10333 


1244 


52 


1015 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino 
acid sequence SEQ ID NO: 374. 


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated 
proteins 1A/1B light chain 3 


£31 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a 


674 


100 


1021 


AF190625 


Coturnix 
coturnix 


qdgl-1 


638 


96 


1022 


AL133363 


Arabidopsis 
thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence 


2483 


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


U80736 


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme 
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces pombe 


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99 


1035 


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabditis elegans 


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein 
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabditis elegans 


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


80 


1040 


AF290204 


Homo sapiens 


blood group carrier molecule 
DOK1 


1637 


99 


1041 


Y96730 


Homo 
sapiens 


PRO539, a Costal-2 homologue. 


162 


22 


1042 


AF140683 


Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BCDNA.GH04929 


204 


37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence. 


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase 


1317 


100 


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens 


dJ1184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP)) 


981 


92 


1049 


AF163825 


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30 
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-1 


236 


85 


1052 


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98 


1054 


AL162756 


Neisseria 
meningitidis 


Glu-tRNA(Gln) 
amidotransferase subunit A 


682 


44 


1055 


AF181856 


Rattus 
norvegicus 


tRNA selenocysteine 
associated protein 


1525 


99 


1056 


U89649 


Chlamydomonas reinhardtii 


Mr19,000 outer arm dynein 
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like 
protein pemphaxin 


1710 


99 


1059 


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060 


AF224263 


Heterodontus 
francisci 


HoxD8 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces 
coelicolor 
A3(2) 


hypothetical protein 


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10 
(HYDRL-10). 


2547 


100 


1064 


AF263614 


Homo sapiens 


acetyl-CoA synthetase 


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PRO221. 


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292) 


662 


98 


1067 


Y18930 


Sulfolobus 
solfataricus 


hypothetical protein 


162 


29 


1068 


R65969 


Homo sapiens 


T98G Glioblastoma-derived 
polypeptide. 


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein 
fragment 


663 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y1^92 


Homo sapiens 


Amino acid sequence of 
protein PRO328. 


1271 


91 


1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated 
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


66 


1079 


AL132965 


Arabidopsis 
thaliana 


putative WD-40 repeat-protein 


286 


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like 
protein 


579 


100 


1082 


AF016416 


Caenorhabditis elegans 


F29A7.4 gene product 


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus 


unnamed protein product 


151 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003. 


202 


97 


1086 


AB030814 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563. 


613 


100 


1090 


AK023982 


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


U34973 


Mus musculus 


protein tyrosine phosphatase-like 


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein 
PRO828. 


522 


56 


1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


1029 


99 


1096 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


U80029 


Caenorhabditis elegans 


similar to thioredoxin 


242 


39 


1099 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1321 


99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1016 


99 


1103 


AL110244 


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 


U28016 


Mus musculus 


parathion hydrolase 
(phosphodiesterase)-related 
protein 


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814. 


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8120. 


475 


96 


1115 


AF229439 


Mus musculus 


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


404 


85 


1118 


A12155 


Homo sapiens 


Human X5L cDNA. 


1673 


100 


1119 


AL161542 


Arabidopsis 
thaliana 


isomerase like protein 


607 


53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat 
Ca2+/Calmodulin dependent 
Protein Kinase LIKE protein) 


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


321 


36 


1122 


Z14122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123 


AF225418 


Homo sapiens 


lipase 


1531 


97 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035690 


Homo sapiens 


dJ202I21.1 (novel protein) 


952 


100 


1126 


AJ000217 


Homo sapiens 


CLIC2 


1286 


99 


1127 


AB030505 


Mus musculus 


UBE-1C2 


1069 


79 


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein 
sequence. 


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl 
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100 


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6). 


1406 


100 


1132 


Z68197 


Schizosaccharomyces pombe 


putative nuclear pore protein 


596 


39 


1133 


Z68197 


Schizosaccharomyces pombe 


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila 
melanogaster 


clathrin-associated protein 


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440 


98 


1139 


W88104 


Homo 
sapiens 


A Rab protein designated 
HRABS-2. 


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339. 


3979 


98 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


3309 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310. 


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34)) 


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34)) 


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


98 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence. 


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 
HEAAR60. 


228 


55 


1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma 1 


1855 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein 
sequence. 


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117. 


607 


99 


1157 


AF112444 


Lupinus 
luteus 


L-asparaginase 


287 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein 


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163 


AF113534 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


945 


76 


1167 


AF187733 


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase 


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188 


Saccharomyces cerevisiae 


putative 


180 


22 


1172 


AF113751 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabditis elegans 


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabditis elegans 


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo sapiens] 
>R94974 R94974 
09-MAY-1996 27-OCT-1994 
Human TCL-1 polypeptide. 
[Homo sapiens 


T cell leukemia/lymphoma 1 


617 


100 


1183 


U42841 


Caenorhabditis elegans 


short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03. 


636 


100 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2 


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


182 


33 


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus influenzae Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1 


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3 


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a 


2981 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein 


912 


47 


1198 


AF125443 


Caenorhabditis elegans 


contains similarity to S. 
pombe phosphatidyl synthase 
(GB:Z28295) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


2235 


76 


1205 


AB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone 
biosynthesis 
methyltransferase-like 


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF207989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21549 


Mus musculus 


Ac39/physophilin 


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein 


945 


66 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear 
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72510 


Caenorhabditis elegans 


similarity to yeast UTR3 
protein (Swiss Prot accession 
yk677h11.5 comes from this 
gene 


634 


49 


1217 


Z49703 


Saccharomyces cerevisiae 


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabditis elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


58 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21 
antigen 


2261 


100 


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit 


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence. 


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226 


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


2346 


100 


1228 


AJ005620 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein 


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein 


479 


96 


1231 


L08239 


Homo sapiens 


located at OATL1 


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabditis elegans 


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabditis elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated- 
tyrosine-phosphorylated 
protein 


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


656 


100 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N- 
acetylgalactosaminyltransferase; 
similar to Q07537 
(PID:g1171989) 


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel 
modulatory protein 1 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19 


124 


46 


1252 


AF146738 


Rattus 
norvegicus 


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating 
enzyme polypeptide. 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC41195_1 


831 


78 


1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA 
transformylase 


1556 


88 


1257 


Z35094 


Homo sapiens 


SURF-2 


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214. 


2383 


100 


1259 


AC005014 


Homo sapiens 


similar to RFP transforming 
protein; similar to P14373 
(PID:g132517) 


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyl transpeptidase 
(AA 1-568) 


697 


32 


1263 


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1 


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 
associated protein-1 (CNAP- 
1). 


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein 
sequence. 


1622 


100 


1267 


AF061346 


Mus musculus 


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabditis elegans 


C13F10.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab37 


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 


55 


1272 


AF201933 


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710 


Arabidopsis 
thaliana 


putative protein 


348 


49 


1275 


AC004449 


Homo sapiens 


R33683_3 


556 


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 
HL2AG87, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9 
(HYDRL-9). 


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


478 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein 
PRO1344. 


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis 
thaliana 


Similar to AIG1 protein 


406 


35 


1283 


AK024432 


Homo sapiens 


FLJ00022 protein 


403 


35 


1284 


W96153 


Homo sapiens 


Human FADD-interacting 
protein (FIP). 


1825 


81 






Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AF178632 


Homo sapiens 


FEM-1-like death receptor 
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
I38027 (PID:g2135214) 


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
I38027 (PID:g2135214) 


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54 


1291 


Z73424 


Caenorhabditis elegans 


C44B9.1 


235 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF190425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140 


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937. 


538 


99 


1295 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100 


1297 


X57560 


Escherichia 
coli 


pspE protein 


£35 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabditis elegans 


coded for by C. elegans cDNA 
yk61f1.3; coded for by C. 
yk109h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 


X55989 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


X52904 


Escherichia 
coli 


open reading frame (AA 1-65) 


359 


100 


1304 


U19577 


Escherichia 
coli 


galactonate dehydratase 


242 


93 


1305 


AF266508 


Mus musculus 


NELF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


932 


100 


1307 


U58750 


Caenorhabditis elegans 


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ210B1.1 (KIAA0680) 


267 


34 


1310 


X82693 


Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabditis elegans 


C47A4.1 


283 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein 
sequence. 


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein 
PRO1344. 


1909 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein 


1044 


100 


1321 


AF174605 


Homo sapiens] 
>Y83086 Y83086 
09-MAR-2000 28-AUG-1998 
F-box protein FBP-18. 
[Homo sapiens 


F-box protein Fbx25 


467 


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 
retrovirus 


pol 


304 


64 


1324 


AL138655 


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


AL133215 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99 


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence. 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


1332 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence. 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1 


411 


91 


1334 


Z82271 


Caenorhabditis elegans 


Similarity to Mouse kinensin- 
like protein KIF4 comes from 
this gene 


578 


44 


1335 


AE000810 


Methanobacterium 
thermoautotrophicum 


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabditis elegans 


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-ll protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
67881 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844) 


894 


35 


1345 


AF257466 


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like 
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14 


613 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS 


870 


74 


1355 


AC024876 


Caenorhabditis elegans 


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


"<£4 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 
element binding and beta 
transducin family proteins 


5035 


99 


1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 


Homo sapiens 


glucokinase regulator 


2682 


97 


1364 


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGT1 
protein 


2055 


99 


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 
T19B10.6 (Tr:Q22557)) 


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnov15. 


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA - 
25 to 1018) (3416 is 2nd base 
in codon) 


5908 


99 


1372 


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


U53445 


Homo sapiens 


DOC1


1645 


46 


1376


AL117337 


Homo 
sapiens 


bA393J16.1 (zinc finger 
protein 33a (KOX 31)) 


250 


60 


1377 


AC005328 


Homo sapiens 


R26660_1, partial CDS


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene 


1823 


69 


1379 


L15313 


Caenorhabditis elegans


putative 


858 


58 


1380 


Y257S6 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus musculus


G beta-like protein GBL 


1721 


96 


1384 


AF237676 


Mus musculus


G beta-like protein GBL


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory
protein CaREG-1.


715 


100 


1386 


AF212162 


Homo sapiens 


ninein


10369 


99 


1387 


AL031685


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305.


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


P1.11659_5 


280


54 


1402 


X91489 


Saccharomyces cerevisiae


putative HMG box 


164 


27


1403


Y79222 


Homo 
sapiens 


Human transferase TRNSFS-14. 


2842 


100 


1404 


X81058 


Mus musculus


tex261 


1010 


99 


1405 


AB012084 


Mus musculus 


ITM 


194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


Rattus 
rattus 


PTB-like protein 


2684 


99 


1408 


X75760 


Drosophila 
melanogaster 


LRR47


364


29 


1409 


U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


AC005578 


Homo sapiens 


F20B87_1, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein


360 


100 


1412 


X01563 


Escherichia 
coli 


ic (rplE) (aa 1-179) 


911 


100 


1413 


W7B279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2))


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qs!4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y48517


Homo sapiens 


Human breast tumour-
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus 


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534 


Homo sapiens 


HSPC049 


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein
PRO1106. 


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


192 


34 


1434 


R99800 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1. 


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100


1442


AB039669 


Homo sapiens 


ALEX3 


1944 


100 


1443 


AF237711 


Drosophila 
melanogaster 


Diablo 


191 


27 


1444 


AJ011896 


Homo sapiens 


Naf1 beta protein


439 


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


98 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated 
antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC_2H01 


2645 


99 


1448 


AF003136 


Caenorhabditis elegans


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen 


1184 


89 


1450 


Y95004 


Homo sapiens 


Human secreted protein 
vc54_1, SEQ ID NO: 48.


985 


100 


1451 


AF107203


Homo sapiens 


ataxin 2-binding protein 


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


Z38011 


Mus musculus 


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564M11.3 (similar to 
sialyltransferase)


1356 


100 


1456 


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


AF242552 


Gallus 
gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs Z43979 
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465 


U32743 


Haemophilus
influenzae Rd


fucose operon protein (fucU) 


315 


50 


1466 


Y09022 


Homo sapiens 


Not56-like protein 


2342 


100


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene 


1072 


99 


1468 


AF071544 


Spinacia 
oleracea 



ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


26 


1469 


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54. 


1053 


100 


1470 


AF032666 


Rattus 
norvegicus 


rsec5


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17).


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936


Homo sapiens 


HTS1 


1101 


50 


1475 


Y86241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42B31 


Caenorhabditis elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116


100 


1480 


U10536 


Pan paniscus 


MHC class I A


675 


84


1481


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086))


1274 


65 


1482 


Z98977 


Schizosaccharomyces pombe


putative vacuolar protein 


256 


29 


1483 


AB005662 


Mus musculus 


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 


575 


99 


1487 


X841S6- 


Saccharomyces cerevisiae


ATH1 


341 


29 


1488


AF038963 


Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabditis elegans


coded for by C. elegans cDNA 
yk30b3.5; coded for by C. 
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000989 


Archaeoglobus fulgidus


enoyl-CoA hydratase (fad-4)


533 


46 


1491 


M80633 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence.


3513 


99 


1493 


Y17220 


Homo sapiens 


Human secreted protein (clone 
fj283-11).


462 


37 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


1^71 


100 


1496 


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 


AF037447 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207W of 
the HAP2 transcriptional
complex related protein 


269 


35 


1499 


AB039947 


Homo sapiens 


X11L-binding protein 51


227 


36 


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


AL050333 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116))


2439 


100 


1502 


AF179896 


Homo sapiens 


TALE homeobox protein Meis2b 


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 


Y53005 


Homo sapiens 


Human secreted protein clone 
pm749_8 protein sequence SEQ 
ID NO: 16.


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2


3580 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 


U64601 


Caenorhabdit 
is elegans 


Gene probably begins in the 
next cosmid


415 


58 


1511 


AL356192 


Neurospora 
crassa 


related to MDM1 protein


196 


29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100 


1513 


AF168717 


Homo sapiens 


x 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516


AF115435


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AF003140 


Caenorhabditis elegans


C44E4.5 gene product 


274 


31 


1518


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate
aminotransferase 


2238 


82 


1519 


AL121764 


Schizosaccharomyces pombe


yeast atp12 protein precursor
homolog


270


30






1520 


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein 
PRO190. 


985


100 


1523 


Y94456 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


37 


1525 


AF109377 


Mus musculus


ldlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus


acid sphingomyelinase-like
phosphodiesterase 


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein


611 


100 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase 


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


493 


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 
RanBP6 


5707 


99 


1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.9 


374


37 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_1 derived protein.


3693


99 


1538


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 


AF266756 


Homo sapiens 


sphingosine kinase | 


2011 


99 


1540 


Z48804 


Homo sapiens 


OA1


2238 


100 


1541 


AF000195 


Caenorhabditis elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


42 


1542 


Y71159 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330


Homo sapiens 


HRIHFB2007 


631 


50 


1545 


AF198487 


Homo sapiens 


transcription factor LBP-1b


2822 


100 


1546 


AF016417 


Caenorhabditis elegans


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 


AB035495 


Carassius
auratus 


ubiquitin-activating enzyme
E1


836 


42 


1549 


AL021707 


Homo sapiens 


dJ508I15.4 (KIAA0668) 


3688 


100 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein 


292 


42 


1551 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


822


44 


1552 


AL157734 


Schizosaccharomyces pombe


putative mannosyltransferase
involved in N-glycosylation 


435 


37 


1553 


AF079527


Mus musculus 


IER5 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule,
ISMO-3.


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain
dehydrogenase/reductase


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport
protein, MTRP-1.


1975


99






1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-acetylglucosaminyltransferase


262 


44 


1561 


AL109827 


Homo sapiens 


dJ309K20.2 (acrosomal protein 
ACR55 (similar to rat sperm 
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216_1


919 


82 


1566 


AF000195 


Caenorhabditis elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473 


Mus musculus 


truncated form of Sox17


1047 


78 


1569 


AK025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1 


5388 


100 


1572 


AE003831 


Drosophila 
melanogaster 


CG18445 gene product


180 


31 


1573 


AF074603 


Streptomyces
griseus 
subsp. 
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabditis elegans


F22D3.3 gene product


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster 


Diablo 


421 


54 


1578 


G00975 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744 


Cryptosporidium parvum


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA 
Em:AK000219))


6^3 


100 


1581 


AF041853


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


iida 


100 


1583 


AE001803


Thermotoga 
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein


3973


100 


1585


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1 


3494 


99 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4.


181 


47 


1591 


Z25535


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces pombe


hypothetical protein


235


54








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ
ID NO: 18. 


2236 


98 


1597 


AF174605 


Homo sapiens


F-box protein Fbx25 


1408


99 


1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


Slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens


gpStaf50


2305 


100 


1601 


Y00876 


Homo
sapiens 


Human LAPH-1 protein 
sequence. 


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2821 


99 


1603 


AJ222801 


Homo sapiens


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605 


AF185576 


Mus musculus 


POZ/zinc finger transcription 
factor ODA-8 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


3765 


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 


97


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1) 


890 


62 


1617 


X58079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y<S^78 


Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein 


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobus fulgidus


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013 


Schizosaccharomyces pombe


mitochondrial carrier protein 


403


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PRO1198.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y35954 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203. 


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus 


unknown 


286 


68 


1630 


AF017094 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae
YD8419.03C 


493


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38


1635


AF02624£ 


Homo sapiens 


HERV-E integrase 


411 


90 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_1 derived protein.


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ
ID NO: 90. 


1320 


100 


1640 


AF235030


Homo sapiens 


BM88 antigen


766 


99 


1641 


AF233288


Drosophila 
melanogaster 


WDS 


358 


26 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42. 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1 


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648 


Y6?342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens 


Eps15R


4464 


99 


1652 


AL161576 


Arabidopsis 
thaliana 


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1656 


AB017910 


Dictyostelium discoideum


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 


2251 


99 


1658 


AF055191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccharomyces pombe


actin-like protein; (2 actin
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3
chain 


16274 


99 


1663 


AF300648


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyces cerevisiae


unknown 


138 


2£ 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668 


S67513 


Borna disease virus BDV,
WT-1, Halle B1/91, horse
brain, field isolate,
Peptide, 370 aa


p40


397


43








1669 


Z99753 


Schizosaccharomyces pombe


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


AF174482 


Homo sapiens 


polycomb 3 


2005 


99 


1673 


Y51846


Homo sapiens 


Human 18.1 homolog protein 
fragment.


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35


152 


29 


1675 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


1580 


91 


1678


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein 


1349 


100 


1681 


AF019236 


Dictyostelium discoideum


TipD 


613 


34 


1682 


AJ243459 


Leishmania 
major 


proteophosphoglycan 


153 


26 


1683 


Z69369 


Schizosaccharomyces pombe


putative GTP-binding protein


560 


46 


1684 


X94910 


Homo sapiens 


ERp28 


1334 


100 


1685 


AF286475 


Takifugu 
rubripes 


retinitis pigmentosa GTPase 
regulator-like protein 


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 


AJ275986 


Homo sapiens 


transcription factor 


2958 


100 


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886 


88 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 


138 


43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein
NUDE1 


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


1336 


60 


1693 


AF177942 


Xenopus 
laevis 


katanin p60 


1664 


£6 


1694 


AF263539 


Homo sapiens 


arginine N-methyltransferase


1774 


100 


1695 


AF222689 


Homo 
sapiens 


protein arginine
N-methyltransferase 1-variant 2


1182 


81 


1696 


AK000193 


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


3122 


100 


1698 


AB041035 


Homo sapiens 


kidney superoxide-producing
NADPH oxidase 


2181 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein 


488 


54 


1700 


Y44676 


Homo sapiens 


Human ARF-Related Protein-1
(HARP-1).


938 


97 


1701 


AK022407 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens 


GTP-binding like protein 2


1172 


100 


1703 


AF055078


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus 


RP42 


1057 


77 


1705


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AL391710 


Arabidopsis
thaliana


putative protein


505


50








1710 


D01311 


Homo sapiens 


Human PRO241 polypeptide.


1649 


97 


1711 


U40750 


Mus musculus 


formin binding protein 30 


45^1 


85 


1712 


AJ011118


Mus musculus 


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303 


Homo 
sapiens 


membrane-associated nucleic
acid binding protein 


4416 


99 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


2960 


100 


1715 


U08227 


Rattus 
norvegicus 


Ras-related protein


511 


51 


1716 


AF168795 


Rattus 
norvegicus 


schlafen-4


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


100 


1719 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1069 


46 


1720 


AF071317 


Mus musculus 


COP9 complex subunit 7b


1297


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1681 


go 


1722 


G01982


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063.


718 


i nn 

1UU 


1723 


AL032643 


Caenorhabditis elegans


similar to Uncharacterized
protein family UPF0034,


825 


A 1 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053.


586


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


inn 


1726 


AF255443 


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1810 


99 


1728 


D10884 


Bos taurus 


neurocalcin 


1002 




1729 


Z18529 


Gallus 
gallus 


tensin


1411 


84 


1730 


Z73423 


Caenorhabdit 
is elegans 


cDNA EST EMBL:Z14908 comes 
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891 


Homo sapiens 


PRO0105 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8 


2015 


100 


1734 


G04050


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


qc 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein 


3531 


94 


1736 


AF096709 


Drosophila 
virilis 


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99 


1738 


L15314 


Caenorhabditis elegans


contains similarity to Pfam 
family PF01772 N=1


206 


37 


1739 


X54618 


Listeria monocytogenes


phosphadidyl inositol specific 

phospholipase C


134 


27 


1740 


AL031658 




similar to predicted C. 
elegans an C. intestinalis 
proteins) 




31 


1741 


Y35924 




Extended human secreted 
protein sequence, SEQ ID NO. 
173.




99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08.


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD06. 


1854 


*1 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPS1A


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100


1748 


AK02443£ 


Homo sapiens 


FLJ00026 protein 


1619 


100 


1749 


AE000877


Methanobacte
rium
thermoautotr
ophicum


conserved protein 


231 


36 


1750 


AF101361


Drosophila 
melanogaster 


Abnormal X segregation 


193 


33 


1751 


Y15067 


Homo sapiens 


ZNF232


869 


100 


1752 


AF251038 


Homo sapiens 


GAP-like protein


822 


100 


1753 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)


352 


57 


1754 


X69089 


Homo sapiens 


165kD protein 


5703 


99 


1755 


AL049795 


Homo sapiens 


dJ622L5.3 (novel protein) 


1039 


100 


1756 


AL031393 


Homo sapiens 


dJ733D15.1 (Zinc-finger 
protein) 


2765 


100 


1757 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransfera
se


2020 


99 


1758 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein) 


776 


43 


1759 


AF117653 


Homo sapiens 


double homeobox protein


375 


54 


1760 


Y12065 


Homo sapiens 


hNop56 


2959 


99 


1761 


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56) 


2595 


99 


1762 


AC002394 


Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100 


1765 


AB013365 


Bacillus 
halodurans 


YlqF 


350 


34 


1766 


Y38421 


Homo sapiens 


Human secreted protein,
encoded by gene No. 36. 


145 


71 


1767 


AC009176 


Arabidopsis 
thaliana 


putative ribulose-1,5-
bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


21* 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


737 


99 


1769 


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99 


1770 


U73522 


Homo sapiens 


AMSH 


1214 


56 


1771 


U89435 


Mus musculus 


unknown 


829 


86 


1772 


S70011 


Rattus sp. 


tricarboxylate carrier 


1604 


95


1773 


AL035086 


Homo sapiens 


dJ44A20.2 (novel protein) 


2036 


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.


1057 


99 


1775 


AF110330 


Homo sapiens 


glutaminase 


3146 


100 


1776 


AJ269529 


Homo sapiens 


glycerol 3-phosphate permease


2787 


100 


1777 


Z81579


Caenorhabdit 
is elegans 


cDNA EST yk76f1.5 comes from
this gene 


232 


31 


1778 


AY007239 


Homo sapiens 


monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccha 

romyces 

pombe 


oxysterol-binding protein
family 


644 


38 


1780


AF254260 


Homo sapiens 


tuftelin 1 


1729 


100 


1781 


L07924


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49


1783 


AK024475 


Homo sapiens 


FLJ00068 protein 


433* 


100 


1784 


AK024475 


Homo sapiens 


FLJ00068 protein 


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014. 


570 


100 


1786 


S82637 


Homo sapiens 


Ig lambda-like gene/beta-
glucuronidase exon 11 homolog


247


100


TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240


class III proteins. 


Olj\J\J ■£ t ±Uo . /(J (J , zb(JS~ 

12 157-181


3 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e-
13 358-381 


4 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e-
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


6 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


8 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


9 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.119e-
09 863-917


10 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464D 17.40 6.182e-
12 294-312 PR00464G
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e-
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 


PR00208A 12.59 9.868e-
10 517-535 PR00208A 
12.59 2.233e-09 520-
538


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
o.500e-13 505-518 
PD00066 13.92 9.500e-
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
o.57le-12 421-434 


18 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.200e-
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333


23 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


25 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat
proteins. 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650 
BL00115M 19.19 8.130e- 
16 731-774 BL00115H 
14.34 9.392e-16 463- 
496 BL00115A 15.44 
7.414e-15 43-82 
BL00115R 6.50 6.128e- 
14 983-1010 BL00115J
16.71 9.289e-14 591- 
617 BL00115I 8.33 
4.336e-13 535-590 
BL00115L 12.25 5.939e- 

11.65 6.011e-13 435- 
463 BL00115K 15.03 
3.417e-10 617-659 
BL00115O 16.76 5.805e-
10 863-913 BL00115P 
11.54 7.538e-10 913- 
953 BL00115S 18.24 
7.968e-10 1010-1052 
BL00115U 10.34 4.475e- 
09 1242-1265 


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113


27 


BL00050 


Ribosomal protein L23 
proteins . 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha- 
hydroxy acid 
dehydrogenases proteins. 


BL00557D 17.76 S.O^Se- 
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e-
28 227-257 BL00557B 
21.27 8.898e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e- 
35 299-328 PR00629F 
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 




PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.356e- 
14 342-360 PR00380C 
13.18 6.927e-13 375- 
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e-
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 239-290 BL00345A 

223 


45


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180-
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 9.328e-
11 246-260 


48 




ZINC- FINGER METAL- 
BINDING NU. 


33 6-45 


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1020-1042
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e-

1 A in. CA 


53 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196- 
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988E 8.27 3.872e-
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509- 
530 PR00762A 14.22 
9.333e-18 199-217
PR00762F 15.12 3.100e- 
16 563-583 PR00762B 
12.12 6.063e-16 230-
250 PR00762E 12.07 
2.286e-15 545-562 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 8.600e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e-
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351 


81 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e-
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579


102


PR00300


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE 


PR00300A 9.56 7.545e- 
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295 
BL00479B 12.57 6.294e- 
12 181-197 


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 Jew 2X632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 5.000e- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191K 17.38 4.951e- 
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins
proteins . 


10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 5.800e-
23 156-187 BL00107B
13.31 9.100e-14 225-
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins . 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 8.560e- 
13 36-67 


119 


PR00529 


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 9.400e-
09 80-95


127 


BL00215 


Mitochondrial energy-
transfer proteins.


BLOO^l^A 1^ fl2 7 mno. 

DDUUZl OA 1J . Oi / . IDOC 

13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-

331 RT,01032n R 33 

8.932e-11 282-296
BL01032I 10.42 8.902e-
09 379-389 


129


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding 
protein. 


BL00880 17.52 5.576e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e-
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e-
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07 
2.800e-13 18-35
BL00028 16.07 5.500e-

16.07 9.100e-13 186-
203 BL00028 16.07

BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.51 8.688e-10 89-101 


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


it? 




3'5'-cyclic nucleotide
phosphodiesterases
proteins.


RT.Dni5(IP 99 fl7 1 dcria. 

DLtUU JL ZOL ^ 4. . U / 1 , 1 DUG- 

25 509-550 BL00126E 
35.22 3.951e-16 654-
709 BL00126D 25.50 
1.360e-15 565-604 
BL00126B 15.20 8.200e- 
11 483-495 BL00126A 
27.56 8.269e-11 442-
479 


151 




proteins . 


20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 

oxidoreductases 

proteins. 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 

O . J ODC" J. J 33" JL9J. 

BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1 
proteins . 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


Us 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e-
13 139-158 


168 


BL00362 


Ribosomal protein S15
proteins . 


BL00362 24.57 9.700e- 
15 129-172 


169 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 1.000e-
35 640-686 BL00039A 
18.44 1.964e-13 212- 
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e-
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 £.455e-
36 6-45


180 


PR00007


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 7.429e- 
20 160-180 PR00007A 
19.33 4.938e-19 133- 
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e-
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162- 
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.7?0e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e-
15 83-105 PR00450C
12.22 6.286e-13 47-69 


193 


PF00564 


Octicosapeptide repeat 
proteins . 


PF00564B 24.74 6.164e- 
16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE 


PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187


195 


BL00901 


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.^3 3.429e- 
18 67-117 


197 


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins . 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors. 


PF00791B 28.49 $ll43e- 
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.781e- 
19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins . 


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 1.900e- 

18.44 1.871e-23 21-60 

11 364-388 BL00039B 
19.19 4.064e-ll 277- 
303 


217 


BL00100


Chloramphenicol 
acetyltransferase
proteins. 


BL00100D 17.22 8.484e- 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e- 
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN
SIGNATURE 


PR00875A 5.83 1.000e-
09 901-913 


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e- 
19 18-39 


226 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e-
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229 


PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 11 Qfl 7 

13 329-346 PR00301G
13.78 4.300e-12 361-
382


230 


BL00460 


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e- 
20 35-70 BL00460B 

9 TK 7 £90j*=»_*|£: *7Q_Cj£ 
/J / .H.42tZ J. O fo "3D 

BL00460C 14.35 2.831e- 
12 111-134 BL00460D 
16.89 8.773e-11 140-
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins. 


BL00292B 20.31 7.429e- 
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C
17.27 4.462e-11 47-70
PR00449D 10.79 7.120e- 
11 109-123 


235 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243


236 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 245-259 PR00019B 
11.36 5.320e-09 113- 
127 PR00019B 11.36 
1.000e-08 223-237 


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 8.448e-09 
67-81 


240 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding
region s.


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 8.043e-
09 124-134 


246 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
20.32 1.000e-40 305- 
351 BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e- 
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254 


BL00674 


AAA-protein family 
proteins . 


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e- 
09 61-88 


256 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002B 15.18 2.800e-
10 421-435 


258 


PR00094 


ADENYLATE KINASE 
SIGNATURE 


PR00094C 12.94 2.200e-
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e- 
13 60-91 


262 


BL00388 


Proteasome A-type
subunits proteins. 


BL00388A 23.14 1.000e-
40 8-54 BL00388B 
31.38 3.864e-33 66-108 
BL00388D 20.71 1.000e-
21 153-184 BL00388C 
18.79 8.147e-16 126- 
148 


264 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270


BL00226 


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e- 
15 96-111 


271 


PD02952 


KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULTIGENE FAMI.


PD02952C 15.76 9.731e- 
16 235-265 PD02952B 
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274 


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7 
proteins . 


BL00052A 27.85 6.000e- 
12 137-lfld BL00052B
15.17 5.143e-12 208- 
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
on in«j tic noflnn dp
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 

13.41 1.000e-21 76-92 
PR00319A 15.27 8.364e-
21 38-55 PR00319B 
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease . 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721 
BL00028 16.07 6.478e- 
12 461-478 BL00028 
16.07 8.435e-12 844- 
861 BL00028 16.07 

1iD7«C J. J. D y J — O ± U 

BL00028 16.07 2.038e- 
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953 


Glycosyl transferase. 


PF00953C 19.70 8.773e- 
34 236-269 PF00953A 
19.68 5.000e-25 102- 
129 PF00953B 6.17 
1.000e-13 182-194 


304 


PF00152 


tRNA synthetases class 
II. 


PF00152D 21.30 8.364e- 
28 422-461 PF00152C 
28.03 9.250e-21 220- 
257 PF00152B 15.67 
2.658e-13 159-184 
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


13 188-212 PR00237G 
19.63 7.207e-13 268- 
295 PR00237A 11.48 
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X
proteins. 


BL00522C 11.90 7.577e- 
24 315-339 BL00522F 
14.90 1.310e-15 470- 
494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 5.235e- 
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e-
14 151-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins . 


PF00651 15.00 5.091e-
15 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e-
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e-
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282- 
335 


326 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54 
7.848e-10 518-569 
BL00412D 16.54 1.827e- 
09 514-565 BL00412D 
16.54 1.918e-09 513-
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e- 
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523 
BL00232C 10.65 2.537e- 
12 256-274 BL00232C 
10.65 4.326e-11 368-
386 BL00232C 10.65 
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family
proteins.


BL01016C 22.84 3.925e-
32 70-115 BL01016E
14.88 5.286e-19 149-
177 BL01016H 13.71
7.577e-13 291-301
BL01016D 8.86 3.298e-
11 127-140 BL01016G
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e- 
10 4-19 BL01016F 
13.34 1.563e-09 200- 
212 BL01016B 8.93 
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187


Calcium-binding EGF-like
domain proteins pattern 
proteins. 


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04 
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
[illegible]
247 BL01187A 9.98
[illegible]


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins.


[illegible]
11 542-553


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.4^2e- 
15 261-274 PD00066 
13.92 6.500e-13 233- 
246 PD00066 13.92 
4.300e-09 289-302 


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY


PR00450C 12.22 5.080e- 
10 73-95 PR00450C 
12.22 3.278e-09 109- 
131 


364 


PF00242


DNA polymerase (viral) 
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


365 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e-
[illegible]


368 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e-
12 410-425


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B
13.31 1.692e-12 342- 
358 


381 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 5.714e- 
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e-
10 97-130


388 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e- 
13 516-529 


389 


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 7.667e- 
09 151-174 


390 


BL00215 


Mitochondrial energy-
transfer proteins.


BL00215A 15.82 5.200e-
[illegible]
15.82 7.613e-14 20-45
[illegible]
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e-
11 141-155


398 


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins.


BL00240B 24.70 7.907e-
10 118-142


401 


PF00676 


Dehydrogenase E1
component.


PF00676B 24.71 8.071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486-
[illegible] PF00676C [illegible]
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and
gamma chains C-terminal
domain proteins. 


28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e-
10 4519-4534 BL00514H
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin.


PF00992A 16.67 5.974e- 
09 105-140 


404 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.450e- 
10 73-87 PR00019A 
11.19 8.043e-10 76-90 
PR00019B 11.36 1.000e-
09 50-64 PR00019B
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins.


BL00232B 32.79 9.557e- 
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407 


PF00426 


Outer Capsid protein VP4
(Hemagglutinin).


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 [illegible]e-
09 252-275


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603


Thymidine kinase 
cellular-type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 3.571e-
31 245-291 BL00866C
23.26 9.000e-25 331- 
366 


418 


PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors. 


PF00791B 28.49 7.955e-
14 23-78 PF00791B
28.49 3.653e-12 273- 
328 PF00791B 28.49 
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 

PF00791B 28.49 6.202e-
09 189-244 PF00791B

28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109D 17.04 5.881e-
10 [illegible]


429




Zinc finger, C3HC4 type
(RING finger), proteins. 


11 31-40 


431


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


34 490-536 BL00039A 
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 [illegible]e-
15 333-357


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),
P15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246- 
281 


449 


PR00568


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e- 
09 39-53 


451 


PF00084 


Sushi domain proteins 
(SCR repeat proteins.


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 618-649 


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


467 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 3.721e- 
09 282-330 


473 


BL00344 


GATA-type zinc finger
domain proteins.


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 8.909e- 
09 173-199 


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e- 
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430- 
448 PR00405A 17.71 
4.971e-18 411-431 


482 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A
19.33 6.192e-22 626-
653 PR00007C 15.60
5.846e-19 698-720
PR00007D 9.64 3.647e-
13 732-743


487 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567B 18.23 2.853e- 
09 200-214 


488


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-lb 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.86*46- 
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins.


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.37 7.923e- 
09 185-200 


500


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.353e-
11 299-318


501 


BL01159 


WW/rsp5/WWP domain 
proteins. 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e- 
17 492-510 


508 


PR00120 


H+TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 293-317 


512 


PF00534


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243 
PD01841M 10.82 8.594e- 
21 1054-1073 PD01841I 
23.00 2.667e-13 549- 
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.18Se- 
12 410-423 


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.86 8.320e- 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins.


BL00319C 17.12 8.375e- 
10 61-95 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins. 


PF00789B 19.70 3.308e- 
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase /
zeta-crystallin
proteins.


BL01162C 22.80 1.560e- 
16 120-164 



WO 01/53312 



PCT/US00/34263 





529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy- 
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098


Thiolases acyl-enzyme
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314-
352 BL00098F 10.18
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140-
157 PR00370A 3.35
6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e- 

16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 [illegible]e-
11 397-414 BL00028
16.07 4.462e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028 


Zinc finger, C2H2 type,
domain proteins. 


10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e- 
31 293-329 BL00250B 
27.37 5.286e-24 354- 
390 


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
15.27 7.344e-09 210- 
227 


548 


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


549 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin- 
transferase) . 


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins.


BL00290B 13.17 1.600e-
14 187-205 BL00290A
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding
protein, unique domain
proteins.


PF00658C 16.33 9.455e-
32 118-155


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins.


BL00141A 12.10 4.150e- 
10 472-488 


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e- 
13 229-268 


569 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B
13.31 5.500e-15 183-
199


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E
19.47 6.559e-19 508-
537


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 470-499 PR00193C
12.60 2.636e-31 239-
267 PR00193B 11.69
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116 


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
[illegible]
965


578 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e- 
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.360e-09 386-
400 PR00019A 11.19
3.333e-09 389-403
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 

20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
[illegible]
16.85 8.230e-10 1686-
1705


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916- 

3.186e-ll 784-804 


586 


PF00013 


KH domain proteins 
family of RNA binding 
proteins.


PF00013 [illegible]
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 4.409e- 


589 


BL00478 


LIM domain proteins.


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e-
15 931-948


591


PF00855 


PWWP domain proteins.


PF00855 13.75 8.000e-
15 1062-1079


593 


PF00628 


PHD-finger.


PF00628 15.84 [illegible]e-
12 424-439


594 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 [illegible]e-
16 558-576 PR00205A
14.73 9.308e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e-
10 55-89


600 


BL00242 


Integrins alpha chain 
proteins.


BL00242E 9.03 9.591e- 
27 985-1014 BL00242C 
16.86 4.115e-26 286- 
316 BL00242D 13.57 
4.150e-25 357-382 
BL00242B 8.13 7.353e- 
12 189-199 BL00242D 
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 61-73
BL00242D 13.57 4.986e- 
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e- 
09 198-213 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.569e-
10 331-348


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 l.OOOe- 
13 335-358 


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 265-282 


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 

[illegible]
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787 


615 


PD02699 


PROTEIN DNA-BINDING
BINDING [illegible]


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455


617


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN.


DM01206B 10.69 5.143e-
12 531-551 DM01206B
10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e- 
21 561-582 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543-
566


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113- 
139 BL00641D 13.23 
9.308e-29 216-240 


627


PR00103 


CAMP-DEPENDENT PROTEIN 
KINASE SIGNATURE 


PR00103E 17.80 2.500e- 
18 367-380 PR00103B 
13.39 2.080e-14 297- 
312 PR00103A 9.59 
2.9S7e-14 282-297 
PR00103D 10.83 3.077e- 
12 346-358 PR00103C 
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e- 
16 4-22 


631 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins. 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 l.OOOe- 
10 607-623 


643 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD-finger.


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464-
479


648


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236- 
279 BL01129B 12.51 
6.118e-13 191-212 


649 


BL01228 


Hypothetical cof family 
proteins.


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 6.684e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3)
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045 


653


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47 
3.143e-22 279-301 
PR00253D 16.68 7.652e-
[illegible]


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 

156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 580-595 


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e- 
13 539-572 DM00215 
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584

10 548-581 DM00215 

583 DM00215 19.43 

[illegible]

DM00215 19.43 7.107e- 
10 544-577 


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e-
09 224-236


661 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.950e- 
23 249-292 


662


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610


666 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE


PR00819B 10.83 8.988e-
[illegible]


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 119-133 PR00019A
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 


670 


BL00018


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR. 


PD00131B 34.97 1.000e-
34 356-410 PD00131C
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 


PR00667G 15.33 7.557e- 
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608 


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572- 
587 PR00320B 12.19 
3.250e-09 572-587 


676 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.667e- 
09 249-263 


679 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.200e- 
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55
proteins.


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 140-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.70
1.000e-40 222-266
[illegible]
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.071e- 
31 152-195 


692 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 58-70 


696


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 [illegible]e-
17 173-195


697 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267-
[illegible]

9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e- 
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e- 
15 86-98 BL00523C 
12.64 5.500e-13 137-
[illegible]
1.844e-11 290-302
[illegible]
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e- 
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01354Y 10.69 4.977e- 
38 425-465 DM01354X 
13.86 7.300e-34 376-
413 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e-
[illegible]


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e- 

18.44 2.537e-18 147- 
186 BL00039C 15.63 
2.216e-14 280-304 
BL00039B 19.19 1.947e- 


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins.


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 l.OOOe- 
40 131-172 BL00243C 
16.42 1.000e-40 172- 
208 BL00243D 24.07 
1.000e-40 222-274 
BL00243F 22.63 l.OOOe- 
40 314-358 BL00243I 
31.77 6.571e-39 607- 
650 BL00243E 16.70 
3.077e-35 274-304
BL00243G 21.38 3.625e-
[illegible] 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A [illegible]
3.250e-21 63-84
BL00243H 17.53 7.167e- 
16 477-503 BL00243H 
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e- 
34 135-161 PR00704F 
13.61 7.000e-26 190- 
218 PR00704E 12.55 
8.071e-26 165-189 
PR00704B 17.94 2.241e- 
23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


723


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


724 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e- 
11 323-338 PR00320B 
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins. 


BL00039A 18.44 2.555e-
28 26-65 BL00039D 
21.67 2.105e-20 338- 
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e-
12 41-81 


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.345e-21 60-78 


748 


BL00612


Osteonectin domain 
proteins.


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 


752 


BL00795 


Involucrin proteins.


BL00795C 17.06 6.000e- 
11 384-429 BL00795C 
17.06 9.444e-11 370-
415 


754 


BL00051 


Ribosomal protein L39e
proteins.


BL00051 20.92 1.935e- 
16 4-50 


755 


DM01970 


0 Jew ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e- 
10 309-324 BL01208B 
15.83 8.031e-10 165- 
180 BL01208B 15.83
4.162e-09 85-100 


770 


BL00031


Nuclear hormones 
receptors DNA-binding 
region proteins. 


BL00031A 19.55 9.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44


773


BL00523


Sulfatases proteins. 


BL00523E 19.27 9.333e- 
23 299-329 BL00523A 
13.36 2.200e-13 47-64 
BL00523B 8.64 2.607e- 
13 91-103 BL00523D 
9.89 7.923e-12 224-236 
BL00523C 12.64 4.512e- 
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384


775 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 568-585


776 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 

239 


779 


PR00079 


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e- 
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690 


DEAH-box subfamily ATP- 
dependent helicases 
proteins. 


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e- 
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e- 
16 50-73 PR00449A 
13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125 


788 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.767e- 
10 1-21 


790 


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e- 
39 725-764 BL00915B 
22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e- 
10 120-138 PR00208A 
12.59 6.294e-10 121-
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 

PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65 


796 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-

T"5 T Q C — *5 A *7 TIT I\C\A 1 On 

16.54 5.705e-11 197-
248 BL00412D 16.54 
7.848e-10 199-250 
BL00412D 16.54 1.827e- 
09 195-246 BL00412D 
16.54 1.918e-09 194- 

245 BL00412D 16.54

2.102e-09 201-252 


797 


BL00021 




13 40-58 


799 


BL01052 


Calponin family repeat
proteins.


DUU1U -J <£ V- lu ■ JX X . UvUw 

40 87-127 BL01052A 
16.12 1.529e-32 3-35 
BL01052B 15.31 1.257e- 
25 52-78 BL01052D 
10.26 5.737e-25 174- 
194 


800 


BL00348


p53 tumor antigen 
proteins . 


BL00348F 23.19 3.714e- 
09 197-240 


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins.


BL00309C 18.65 1.621e- 
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354


811 


BL00685


CBF-A/NF-YB subunit 
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.793e-13 5-54


812 


PR00080


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e- 
15 158-171 PD00066 
13.92 5.200e-14 46-59 
PD00066 13.92 7.000e- 
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92 
7.500e-13 214-227 
PD00066 13.92 9.000e- 
13 102-115 PD00066 
13.92 4.429e-12 186- 
199 PD00066 13.92 
1.783e-11 74-87


816


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520 


Interleukin-10 family
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-14?


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B 
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e-
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e- 
17.57 3.100e-10 522-
538 


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153 


844


PD02785


PROTEIN RIBOSOMAL 60S
L22 RNA-BINDING HEP.


40 58-112 PD02785A 
15.23 1.915e-28 8-57 


845 


BL00826 


MARCKS family proteins.


BL00826C 7.63 6.738e- 
09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins.


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578 


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.929e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107- 
123 PR00988F 12.23 
7.828e-15 198-212 
PR00988E 8.27 9.759e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e- 
10 60-72 


863 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267-
286 PR00775F 12.76 
6.769e-14 249-267 


866 


DM01688


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e- 
09 89-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e-
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins. 


BL01287A 17.95 2.£8Be- 
26 16-48 


869


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.036e- 
32 655-711


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B 
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


869 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-ll 16-36 


897 


PR00327


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e- 
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 

11 ZJD SOU 


901 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 

16 254-267 PD00066

13.92 8.200e-16 282- 
295 PD00066 13.92 
8.200e-16 310-323 
PD00066 13.92 8.200e- 
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
8.200e-14 338-351


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e-
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13 
3.288e-22 370-392 
PR00381F 9.13 7.181e-

T1 ooc ■jno nn n m o t rr> 
JLJ zob-JUo FKOOjdIL 

8.75 4.066e-ll 251-272 

rKUUJOlti O* /3 / . Uj JS" 

11 293-314 PR00381E 

PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310-
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 513-537 


908 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins. 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE 
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019C 14.65 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67
1.600e-10 314-325 
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67 
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins. 


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins. 


BL01085D 16.55 4.600e- 
24 152-183 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 190-220 BL01085C 
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168


C2 domain proteins. 


PF00168C 27.49 4.000e- 
12 336-362 


937 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e- 
10 5-49 


940 


PR00862 


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins. 


BL01230B 11.62 2.373e- 
09 407-420 


948 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins.


BL00479B 12.57 7.429e- 
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e- 
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e- 
15 111-148 


959 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 1.884e-
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e-
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e-
09 55-70 


967


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e- 
12 104-124 DM01206B 
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e- 
10 73-93 DM01206B 
10.69 3.962e-09 108- 
128 DM01206B 10.69 
5.671e-09 38-58 


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e- 
31 417-460 PF01008C 
12.25 5.333e-18 506- 
526 PF01008A 20.14 
5.875e-15 369-390 


970 


BL01277 


Ribonuclease PH 
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167 


Ribosomal protein L17
proteins.


BL01167B 20. 6£ 8.258e~ 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e-
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 

35 263-291 PR00312J 

11 71 5 fiRflp-ld 

_L j . / J 3 .DOOC - J «i JDJ — 

392 PR00312D 9.43 
2.636e-31 128-158
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11 11 fic;7e»- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992


Troponin.


PF00992A 16.67 8.816e-
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins.


DilUU /73V. XI. V/D / . ZJLJLe — 

14 4-49 BL00795C 

BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56 
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939 


Ribosomal protein L1e
proteins. 


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 497-513


994 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase 
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e-
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126-
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e- 
20 174-193 PR00926B
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 5.565e-09 120-
143


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406D 12.58 3.700e- 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406E 8.44 1.000e-
35 248-298 BL00406A
9.95 3.348e-29 11-46


1007 


PR00304 


TAILLESS COMPLEX 
POLYPEPTIDE 1 
(CHAPERONE) SIGNATURE 


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
8.69 4.667e-20 98-118 
PR00304B 11.60 7.577e- 
19 68-87 PR00304A 
9.20 3.382e-16 46-63 
PR00304E 7.79 6.670e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase 
family phosphohistidine 
proteins . 


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C 
14.83 8.844e-11 288-
335 


1028 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e- 
33 43-91 


1033 


PF00580


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY
SIGNATURE


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.657e- 
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299 


Ubiquitin domain 
proteins. 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970


ARGININE ADP- 
RIBOSYLTRANSFERASE 


PR00970A 17.73 6.143e- 
20 56-78 PR00970D 
y . yo z.ib^e-io io4-i/i 
PR00970F 12.30 l.OOOe- 
16 224-241 PR00970G 
9.97 9.229e-15 242-258
PR0,0°7flR 16 "i7 1 290e- 
13 86-105 PR00970C 
11.05 1.643e-11 115-
130 PR00970E 11.23 
9.820e-11 202-218


1042


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 6.786e- 
13 114-128 PR00048A 
10.52 1.000e-09 172-
186 


1045 


BL00615


proteins. 


BL00615A 16.68 1.720e-
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 




nuci ly x <x l- c v<y^j.aoco 

class- I proteins. 


OJJUJ.U J. J « J ^ U . -'"i 

10 3-40 


1047 


BL01216 


ATP-citrate lyase / 
family proteins . 


BL01216D 21.75 4.316e- 
2fi 31d-3d4 Bli012l6A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins . 


BL01073 24.30 1.000e-
/in to co 


1054 


BL00571 


Amidases proteins. 


BL00571 25.^9 5.875e- 
ii icn oio 


1055 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 5.235e- 
1 1 q q in dt nnniriTJ 

IX 30-11/ QliUUUjUD 

7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins 
domain proteins.


BL00223C 24.79 8.754e- 
oi oeo-117 RT.nn92iZk 

15.59 9.478e-14 46-80 
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201


1064 


BL00455 


Putative AMP-binding 
domain proteins . 


BL00455 13.31 6.211e- 

11 OQn_OQC 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.000e- 

VJ7 113-16J lrKUUUi?13 

11.36 3.880e-09 87-101


1066 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


DDnmo ca d o** a cnno- 1 

rKUuJZDA o. /3 t.dUUc- 

16 151-172 PR00326C 
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19 0° 1 257e»-13 217- 

X ? . U 3 X • £m 1 C. X J X X / 

236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1 


PD02870B 18.83 8.518e- 

11 1 fid - 1 Q7 

XX 1DS"13 / 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e- 
09 350-387 


1075 


BL01009


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


oLOlOOyD 14.1? 4.juue- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081 


BL00326 


Tropomyosins proteins.


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460 


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 3.204e- 
18 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e-
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25 
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e- 
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E
13.50 1.857e-09 185-
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 


1116 


BL00355 


HMG14 and HMG17
proteins. 


BL00355 5.97 2.528e-25
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins. 


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.7* 9.526e- 
12 301-324 


1125 


PR00186 


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
09 87-101 


1129 


BL00170 


Cyclophilin-type 
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins . 


BL00990C 18.78 4.176e- 
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 5.000e-
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e-
32 159-188 PR00314A 
14.53 1.281e-22 13-34 


1139 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
19 451-482 BL00107B
13.31 3.077e-12 519- 
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e- 
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e-
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 1.794e- 
10 205-220 PR00320C 
13.01 7.840e-10 205- 
220 PR00320B 12.19
8.457e-10 35-50 
PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e-
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e-
18 1089-1113 


1185 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.761e-
10 77-93 


1188 


BL00878


Orn/DAP/Arg
decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10 9^ OflO#»- 
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


1191 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 72-101 PR00345E
8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e- 
28 101-125 PR00345D
10.97 1.964e-24 125-
149 PR00345A 13.46 
5.645e-16 43-62 


1194 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 108-137 PR00345E
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e-
28 137-161 PR00345D 
10.97 1.964e-24 161- 
185 PR00345A 13.46 
5.645e-16 79-98


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01298 


Dihydrodipicolinate 
reductase proteins. 


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 i R9- 1 ojn 


1204 


PR00118 


BETA-LACTAMASE CLASS A


PR00118F 16.42 9.386e- 

f!Q 91 *J _ 99 Q 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 

Z / . / ± O . 3J36 iS / Z D 4 - 

307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins . 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.857e- 
14.20 1.818e-09 45-55 




PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


14 227-241 PR00048A 
10 R9 4 1 99- 
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PRnn4^£ 


RTRO^flMZXT, DPOTPTN P9 1 

SIGNATURE 


ppnn4c;fiT? ~k n£ 74«o_ 

rKUvvSOCi O.V/O 3 ■ OS 

11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


pnnnnKfi it 99 7 911 p_ 
15 295-308 PD00066 

~\"\ 99 7 911p»-1 c ; 406- 

1 J . / .ZJlC 1J 4UD- 

419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066 
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61 


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B 
16.28 1.000e-40 114- 
168 BL00437C 21.86
1.000e-40 190-239 
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 6-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e- 
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e-
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins. 


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e-
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
13.09 9.500e-15 51-63 
PR00070A 12.92 5.500e- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins. 


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230-
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization 
domain proteins. 


BL00038B 16.97 9.455e-
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e- 
18 165-182 PR00837A 
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e-
12 201-215 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449B 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins 
proteins . 


BL00276A 8.87 1.500e- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-
12 119-135 PR00412C
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e- 
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine 
-binding protein family 
proteins . 


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283


1301


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e- 
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e- 
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.682e- 
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins . 


BL00194 12.16 1.500e-
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins . 


BL00783C 22.43 6.559e-
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins. 


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 7.239e-
09 25-43 


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.769e- 
33 10-49


1336


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 211-230


1340 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e-
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 8.313e- 
09 417-427 


1345 


BL00923 


Aspartate and glutamate
racemases proteins. 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e- 
13 44-57 


1350 


PR00193


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179- 
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-ll 97-124 
PR00447G 6.69 9.877e- 
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303A 21.77 6.667e- 
26 45-82 BL00303B 
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615 


Regulator of G protein 
signalling domain 
proteins. 


PF00615B 16.25 2.216e- 
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.234e- 
29 10-49 


1361 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249- 
274 BL01272A 6.49 
1.231e-18 99-117 


1363 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94


1364 


DM00179


W KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e-
10 1-19 


1371 


BL00242 


Integrins alpha chain 
proteins. 


BL00242B 8.13 8.615e- 
09 469-479 


1372 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e- 
19 46-67 PR00625A 
12.84 1.391e-16 14-34 


1373 


BL00434 


HSF-type DNA-binding 
domain proteins. 


BL00434C 23.85 3.778e-
09 90-130 


1374 


PR00962 


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962C 8.00 6.337e- 
09 505-526 


1375


PD02475 


MUCIN EPITHELIAL TUMOR-
ASSOCIATE. 


PD02475A 23.18 8.552e- 
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380


BL00194 


Thioredoxin family 
proteins . 


BL00194 12.16 8.333e- 
12 48-61 


1381 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 1.455e-
15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 6.203e- 
10 95-132 


1386 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 5.042e- 
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.512e- 
31 32-71 


1392 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243-
262 


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e- 
14 462-475 PD00066 
13.92 8.800e-14 348- 
361 PD00066 13.92 
9.571e-12 405-418
PD00066 13.92 6.087e- 
11 490-503 PD00066 
13.92 8.043e-11 320-
333 


1398 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.786e-
32 10-49 


1400 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.500e- 
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e-
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-
09 176-190 


1409 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5 
proteins . 


BL00358B 11.lt l.OOOe- 
40 57-103 BL00358C 
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e- 
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins . 


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


1418 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE . 


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e-
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-

15.02 7.049e-30 400- 
447 PD01941E it; Q5 
2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093 


1422 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 t>.3l8e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A 
14.19 9.250e-12 298- 
317 BL50002A 14.19 
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 5.219e- 
34 147-193 


1429 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928 


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e-
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-

18.42 1.000e-40 1083-
i i oc PD01R4TR 1ft fifl 
9.719e-38 258-296 
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H 
21.30 3.189e-31 435- 
472 PD01841C 13.78
1.000e-25 185-206 
PD01841M 10.82 1.250e- 
20 1175-1194 


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 2.080e-
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777


Sialyltransferase
family. 


PF00777C 18.60 2.929e-
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins . 


BL00545C 11.28 7.353e- 
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475


PF00686


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e- 
09 267-277


1477 


PF00566


Probable rabGAP domain 
proteins . 


PF00566A 12.64 7.333e- 
10 466-476 


1478 


BL00030 


Eukaryotic RNA- binding 
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406 


GLIADIN . 


DM00406 7.73 8.541e-10 
292-305 


1480 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.385e- 
15 69-87 BL00290A 
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780 


Domain found in NIK1- 
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 1.153e-
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA
hydratase/isomerase
proteins . 


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140- 
167 BL00166B 16.92 
9.357e-11 93-115


1491 


BL00452


Guanylate cyclases 
proteins . 


BL00452D 28.59 3.700e- 
31 63-106 BL00452E 
11.92 3.045e-13 115- 
131 


1492 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins . 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972 


ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e- 
10 341-363 


1512 


BL00523


Sulfatases proteins. 


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si. 


BL00600A 17.98 6.143e-
19 98-122 BL00600E 
16.43 1.771e-17 302-
331 BL00600G 12.43 
9.625e-17 377-396 
BL00600B 19.60 5.091e- 
15 160-186 BL00600C 
16.18 6.040e-12 190- 
206 BL00600F 8.77 
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295 


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19 
9.743e-10 106-121 
PR00320A 16.74 1.878e-
09 192-207 PR00320A
16.74 2.317e-09 106- 
121 PR00320A 16.74 
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538 


DM01970


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 4.508e-
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed) . 


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D 
5.84 1.000e-27 150-170 
PR00965G 8.52 2.440e- 
27 258-279 PR00965B 
4.80 8.650e-26 88-109 
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e-
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-
40 93-142 BL00951D 
13.94 8.714e-40 142- 
177 BL00951A 15.10 
1.000e-38 2-38
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins. 


BL00536F 13.65 8.920e- 
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e- 
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e-
13 67-105 


1557 


BL01228


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e- 
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 6.600e- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e- 
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins . 


PF00651 15.00 1.947e- 
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013


Oxysterol-binding
protein family proteins . 


BL01013D 26.81 8.594e-
17 184-228 BL01013C 
9.97 4.906e-12 14-24 


1567 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.400e-10
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306 


1570 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 5.235e-
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665 


OXYTOCIN RECEPTOR
SIGNATURE


rKUUDODu lZ , JO 4 , O / JC 

24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099 


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins . 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e- 
12 32-54 BL00411H 
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354S 11.61 7.750e- 
09 474-495


1587


PR00072


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e- 
33 180-210 PR00072A
12.75 6.040e-25 120- 
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e-
22 276-295 PR00072E 
10.54 1.360e-19 301- 
318 PR00072G 10.45 
5.304e-19 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family,
heme-binding domain
proteins . 


BL00191H 15.64 1.537e-
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1
CHROMOSOME.


DM00517B 10.96 6.625e- 
16 1175-1191 DM00517A
8.21 1.000e-11 1015-
1026 


1592 


BL00037 


Myb DNA-binding domain
proteins repeat 1 proteins
proteins.


BL00037B 15.92 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B 
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628 


PHD-finger.


PF00628 15.84 3.250e- 
11 1667-1682 


1599 


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 3.571e-
10 44-57 


1607 


BL00252 


Interferon alpha, beta 
and delta family 
proteins. 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94 


1611 


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins 
proteins . 


BL00904C 8.98 7.353e-
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins . 


PF00168C 27.49 3.250e- 
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 6.051e- 
09 932-983 BL00412D 
16.54 7.153e-09 933- 
984 


1614 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 3.531e-
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63 
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e- 
22 500-541 PD01427A 
19.94 8.773e-18 439-
472


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins . 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI. 


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715


1622 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e- 
18 27-41 PR00860C
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e-
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-
40 93-139 BL00325A 
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase
proteins . 


BL00064B 23.57 1.000e-
40 82-130 BL00064C 
17.28 1.000e-40 137- 
182 BL00064E 27.20 
1.000e-40 223-275
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063 


RIBOSOMAL PROTEIN L27
SIGNATURE 


PR00063B 15.24 9.700e- 
11 59-84 PR00063A 
11.71 1.614e-09 34-59


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e-
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins . 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5 

methyltransferase family
proteins . 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS 
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01 
2.800e-10 279-294 
PR00320C 13.01 2.800e- 
10 364-379 PR00320B 
12.19 5.114e-10 279- 
294 PR00320A 16.74 
1.659e-09 279-294 
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RESULTS * 








PR00320A 16.74 2.098e- 
09 229-244 


1642 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 6.464e-09 114-130


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e-11 74-94


1644 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins. 


BL01108A 20.33 7.366e- 
17 56-89 


1646 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 

310 


1647 




,> lruCISv/JM J. INC. IKiVA 

LIGASE . 


Uri V X <i *i «s V- X / . J.O y . / JJ.C" 

37 340-381 DM01242E 

505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e- 
1 fl 265-114 DMQ1242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN
TPR NUCLEA.


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.720e-11 431-485


1652 


BL00933 


FGGY family of 
carbohydrate kinases 


BL00933A 17.50 4.673e-12 11-35
BL00933E 11.50 9.217e-09 456-472


1653




Tnvnl nrt*i ti nT"nf"pi tsq 


BIj0079RC 17 06 2 988e- 
10 70-115 


1654 




Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e- 
17 302-334 


1655 


BL00982


Bacterial -type phytoene 
dehydrogenase proteins . 


BL00982A 18.41 7.750e- 
17 282-314 


1656




Guanine-nucleotide

dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e-16 607-630


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 




Ubiquitin carboxyl-terminal hydrolases
family 2 proteins . 


BL00972D 22.55 4.140e-12 376-401
BL00972E 20.72 5.629e-09 446-468


1660 


BL00406 


Actins proteins. 


BL00406D 12.58 6.767e- 
15 188-243 


1661 


PR00105 


CYTOSINE-SPECIFIC DNA 
METHYLTRANSFERASE
SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85 


1664 


BL00018 


EF-hand calcium-binding 
domain proteins. 


RTiODDl H 7 il c: ncn- •% r\ 
dliuuuio / . t X 9 , USUB'IU 

489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


17 115-141 BL01153C 
13 67 8 977*»-l^ 66 fin 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE 


PR0067RH qui mno. 
10 1146-1169 


1672 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 7.580e- 
11 343-358 PR00049D 
0.00 1.286e-10 342-357 


1676 


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 3.636e-19 427-448
PR00747G 14.50 2.286e-18 368-393
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747D 15.23 8.759e-17 163-183
PR00747E 15.13 8.244e-15 254-272
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 311-328


1677 


PR00747 


GLYCOSYL HYDROLASE

FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-19 309-330
PR00747G 14.50 2.286e-18 250-275
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 193-210


1680 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


dluuo / o y . o / ft.o00e-10 
406-417 BL00678 9.67 


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 

1J J D j ft x u 


1685 


PR00646


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6".3i 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 
10 420-435 


1692 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674 


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e-24 274-317
BL00674B 4.46 4.000e-23 241-2^3
BL00674D 23.41 8.560e-18 338-385
BL00674E 15.24 1.720e-15 414-434


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e-13 187-208
PR00466B 5.03 5.500e-11 162-186
PR00466F 9.16 6.159e-09 498-517


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-12 283-300
BL00028 16.07 3.769e-11 255-272
BL00028 16.07 5.154e-11 171-188
BL00028 16.07 5.500e-11 227-244
BL00028 16.07 1.600e-10 199-216


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear 
protein ran proteins . 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e-12 79-100
BL00038A 13.61 4.000e-11 52-68


1723 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428 


1724 


BL01279 


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins . 


BL00018 7.41 2.059e-11 73-86
BL00018 7.41 4.176e-11 157-170


1730 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61 


1731 


BL01160 


Kinesin light chain
repeat proteins . 


BL01160B 19.54 9.676e- 
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family. 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D 
14.76 6.850e-20 177- 
201 PF00850E 8.88 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA- 
binding domain proteins 
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e- 
10 492-502 


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1745 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e-11 45-57
PR00081E 17.54 3.935e-10 150-168


1747 


BL00439 


Acyl transferases 
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066


PROTEIN ZINC-FINGER 
METAL- BIND I . 


PD00066 13.92 3.400e- 
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 
13 61-74 PD00066 
13.92 6.571e-12 117- 
130 


1753 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 6.516e- 
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e-09 490-521
BL00790I 20.01 2.821e-09 60-91
BL00790I 20.01 6.357e-09 287-318


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.750e- 
35 10-49 


1758 


DM00406


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR T 


PD02929A 28.27 4.529e- 

£ *i — 1 O 


1765 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312 


1778 


BL00084 


Copper type II, 
ascorbate-dependent 
monooxygenases proteins . 


BL00084D 25.11 3.700e-20 169-224
BL00084B 24.26 8.134e-16 10-58
BL00084C 27.71 8.412e-11 107-158


1779 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 3.758e-18 611-655
BL01013A 25.14 2.881e-15 344-380
BL01013C 9.97 6.308e-13 435-445
BL01013B 11.33 3.717e-12 409-420


1783


BL00741 


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
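
The field order given in the footnote above can be illustrated with a short parsing sketch. This is only an illustrative aid and not part of the disclosed methods; the names `SignatureHit` and `parse_results` are hypothetical, and the sketch assumes each hit has the form "subtype raw-score p-value start-end", with values possibly split at a line break after a hyphen.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical container, for illustration)."""
    subtype: str      # accession number subtype, e.g. "BL00064C"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature in the amino acid sequence


# subtype, raw score, p-value in exponent notation, then "start-end"
HIT_RE = re.compile(
    r"(\S+)\s+(-?\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?e-?\d+)\s+(\d+)-(\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Parse a RESULTS cell into hits, tolerating line-wrapped values."""
    text = " ".join(cell.split())
    # Rejoin values split after a hyphen, e.g. "6.500e- 31" or "137- 182".
    text = re.sub(r"(?<=[\de])-\s+(?=\d)", "-", text)
    return [
        SignatureHit(m[1], float(m[2]), float(m[3]), int(m[4]), int(m[5]))
        for m in HIT_RE.finditer(text)
    ]


hits = parse_results(
    "BL00064C 17.28 1.000e-40 137- 182 BL00064D 14.19 6.500e- 31 182-212"
)
for hit in hits:
    print(hit)
```

Entries whose characters are illegible in the table do not match the pattern and are simply skipped by this sketch.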
TABLE 4



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 




Immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 


zf-C2H2


Zinc finger, C2H2 type 


1.6e-21 


84.9


5 


fn3


Fibronectin type III domain 


0 


1097.1


6 


fn3 


Fibronectin type III domain 


0 


1035.0 


7 


fn3


Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40 


146.7 


10 


p450 


Cytochrome P450 


9.5e-17 


62.0 


12 


ank


Ank repeat 


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15 


zf-MYND 


MYND finger 


1.3e-06 


35.4 


16 


zf-MYND 


MYND finger 


1.3e-06 


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-99 


343.9


18 


CAP_GLY 


CAP-Gly domain 


1.2e-25 


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119 


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102


352.6 


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79


277.0 


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit 


0 


1077.7


26 


C1q


C1q domain


1.9e-10


44.4 


27 


Ribosomal_L23


Ribosomal protein L23 


7.8e-32 


111.2 


28 


Ribosomal_L23


Ribosomal protein L23 


1e-29


104.2 


30 


zf-A20 


A20-like zinc finger


1.5e-10 


48.5 


31 


zf-A20 


A20-like zinc finger 


1.5e-10 


48.5 


32 


FMN_dh


FMN-dependent dehydrogenase


5.4e-179


608.1 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.8e-59


209.9 


35 


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig


Immunoglobulin domain 


1.4e-13 


48.8 


40 


kinesin 


Kinesin motor domain 


6.7e-76


265.6 


44 


Ets 


Ets-domain 


1.4e-56 


182.1 


45 


Ets 


Ets-domain


1.4e-56 


182.1 


46 


LRR 


Leucine Rich Repeat 


1.7e-13 


58.3 


48 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-162 


552.8 


49 


ITAM


Immunoreceptor tyrosine-based
activation mot 


1.4e-05 


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras


Ras family 


8.5e-45 


162.3 


53 


PRK 


Phosphoribulokinase


2.1e-65


230.7


54 


myb_DNA-binding


Myb-like DNA-binding domain


0.096 


15.2 


55 


voltage_CLC


Voltage gated chloride channels


3.3e-186


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6


58 


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54 


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0 


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 




Immunoglobulin domain 


8.2e-28 


94.7


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1


74 


pkinase 



Eukaryotic protein kinase
domain


2.8e-38


140.6


76* 


zf-C4_Topoisom


Topoisomerase DNA binding C4 
zinc fing 


5.4e-54 


192.8 


83 


Peptidase_S9


Prolyl oligopeptidase family


4.3e-10


36.8 


84 


fn3 


Fibronectin type III domain 


4.1e-51


183.2


86 


SH2 


Src homology domain 2 


3.1e-22


67.7


88 




Immunoglobulin domain 


0.0091


14.0


89 


WD40 


WD domain, G-beta repeat 


2.1e-21


84.6


92 


laminin_G


Laminin G domain 


6.1e-27 


98.5 


93 


AMP-binding 


AMP-binding enzyme 


2.4e-33


-37.2 


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


2.6e-51


ao j . y 


97 


adh_short


short chain dehydrogenase 


2e-61 


217.5


98 


kinesin 


Kinesin motor domain 


2.2e-86


■inn d 


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36




102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05 


-5.2 


104 


pkinase 


Eukaryotic protein kinase
domain 


7 1** - 7 "5 

c, . /e — / j 


Oct o 


106 


ras 


Ras family 


O . JC - 




107 


FYVE 


FYVE zinc finger 


5.4e-27 


100.7


108 


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase


/ . /e- oi 


215.5


109 


zf-C2H2 


Zinc finger, C2H2 type 






113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2 


115 


PH 


PH domain 


3.1e-11


A c o 
13 


117 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


2.4e-14


R7 R 


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20 


76.3 


120 


WD40 


WD domain, G-beta repeat 


2.4e-14


61.1


121 


WD40 


WD domain, G-beta repeat 


2.4e-14


61.1


123 


IF5_eIF4_eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2


124 


ig 


Immunoglobulin domain 


6.5e-08 


30.6 


127 


mito_carr


Mitochondrial carrier proteins


3e-16


58.6


128 


PP2C 


Protein phosphatase 2C 


2.2e-71 


250.6 


129 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


J • XC"6U 


an £ 
ou . o 


130 


pfkB 


pfkB family carbohydrate kinase 


4.5e-42 


137.1 


133 


ACBP 


Acyl CoA binding protein 


4 4 6e-22 


86.7


134 


rrm 


RNA recognition motif.


X . AC JI 


no . J 


135 


IQ 


IQ calmodulin-binding motif


*2 .OC"UO 


1X.U 


136 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family




DC 7 


139 


WH2 


Wiskott Aldrich syndrome
homology region 2 


v . U VJ D / 


*J . X 


140 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-82


287.5 


141 


Peptidase_S26


Signal peptidase I 


5.7e-10 


35.7 


143


arf


ADP-ribosylation factor family


1.2e-39


145.2


146


KRAB 


KRAB box 


7.3e-30 


112.6 


148


DUF6 


Integral membrane protein DUF6 


0.096 


8.0 


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase


3.8e-80


231.1 


151 


S4 


S4 domain 


1.1e-08


42.3 


153 


tRNA-synt_1d


tRNA synthetases class I (R) 


3.8e-103 


356.1 


154 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome 
reductase 


7.8e-60


212.2 


155 


ras 


Ras family


3.6e-28


107.0 


157 


actin 


Actin 


3.8e-26 


87.1 


158 


Jacalin 


Jacalin-like lectin domain 


0.09 


-24.9 


160 


Zn_carbOpept


Zinc carboxypeptidase


5e-138


4 71 Q 


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1 


167 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


5.3e-07


£ / . U 


168 


Ribosomal_S15


Ribosomal protein S15


1.1e-06


29.0


169


DEAD 


DEAD/DEAH box helicase 


1e-48


157.0


171 


DUF59 


Domain of unknown function 
DUF59 


0.07


- 1 7 A 
X / . *± 


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15 


58. £ 


173 


globin 


Globin 


4.6e-18


67.4


174 


WW 


WW domain 


7.3e-06 


32.9 


175 


ras 


Ras family 


1e-31


no a 
xx o . a 


178 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


2.5e-17


/ x . u 


179 


zf-C2H2


Zinc finger, C2H2 type






180 


C1q


C1q domain


H ftps _ 7 9 


ZDX . ? 


190 


Y_phosphatas
e


Protein-tyrosine phosphatase


4.9e-287 


967.0 


191 


efhand


EF hand 


7.5e-16 


66.1 


193 




domain 




285 . 6 


194 






3 - oe- 


111 . 4 


195 


PALP 


Pyridoxal-phosphate dependent
enzyme 


2.5e-64 


227.1 


197 


DnaJ 


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD


Ribosomal RNA adenine 
dimethylases 


0.00018


16.9


200 


acid_phospha
t


Histidine acid phosphatase 


2.5e-10


37.2


201 


WH2 


Wiskott Aldrich syndrome
homology region 2


n n n o a q


26.9


204 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


1 To- ICQ 




205 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


1.6e-139


476.9


206 


ldl_recept_a


Low-density lipoprotein
receptor domain


2.4e-25


97.6 


209 


ank 


Ank repeat


1.4e-19


7ft 4 


210 


Rhomboid 


Rhomboid family 


0.0035


1.2


211 


C1q


C1q domain


1.6e-70


247.7


212 


UQ_con 


Ubiquitin-conjugating enzyme 


7.4e-74


258.8


213 


UQ_con


Ubiquitin-conjugating enzyme 


le-53 


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43


140.4 


216 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family


4.5e-21 


83.4 


218 


Glycos_trans
f_2


Glycosyl transferases 


4e-21 


83.6 


219 


ig


Immunoglobulin domain


0.092


10.7


222 


WD40


WD domain, G-beta repeat


7.4e-23


89.4


224 


TPR 


TPR Domain 


1.2e-08


42.1


225 


DnaJ_CXXCXGX
G


DnaJ central domain (4 repeats)


1.5e-38


141.5


226 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54 


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144


492.0 


234 


ras 


Ras family 


4.8e-50


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29 


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09 


45.0 


244 


dCMP_cyt_dea
m


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.1 


245 




Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt


wnt family of developmental
signaling protei


9.1e-270


742.6


250 


mito_carr


Mitochondrial carrier proteins 


i XD_cit 


iyj . o 


254 


adenylatekin 
ase 


Adenylate kinase 


1 Q*a - 1 A 
i. . oe- X*k 


re *7 
33 . / 


255 


Cation_efflu
x


Cation efflux family


*3 Q £=> o ^ 

^ . oe-jj 


124.0


256 


SH3 


SH3 domain 


3.9e-14


60.4 


257 


Aa_trans


Transmembrane amino acid
transporter protein


2.6e-52


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110 


380.2 


259 


HIT 


HIT family


8.2e-07


25.3 


260 


Bacterial_PQ
Q


PQQ enzyme repeat 


1.6e-15 


65.0 


262 


proteasome


Proteasome A-type and B-type


6.5e-64


225.7


267 


pkinase


Eukaryotic protein kinase
domain


6.3e-27


101.0


270 


filament 


Intermediate filament proteins


3.2e-150


512.5


271 


Choline_kina 
se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 




Eukaryotic protein kinase
domain


3.3e-77


269.9


280 


WD40


WD domain, G-beta repeat


7.8e-73


255.4 


281 


WD40


WD domain, G-beta repeat


7.8e-73


255.4


284 


zf-DHHC


DHHC zinc finger domain




93.4 


287 


Exonuclease 


Exonuclease 


1.4e-67


238.0


291 


SAM 


SAM domain (Sterile alpha
motif)


0.034


11.2


292 


SAM 


SAM domain (Sterile alpha
motif)


0.034


11.2


294 


zf-C2H2


Zinc finger, C2H2 type


1.4e-29


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125


430.0 


296 


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5 


297 


HMG_box


HMG (high mobility group) box


6.7e-29 


109.4 


302


Glycos_trans
f_4


Glycosyl transferase 


5e-87 


302.5


304 




tRNA synthetases class II (D, K
and N)


1.1e-84


294.8


305 


KRAB 


KRAB box


2e-44 


161.0 


306 


rrm 


RNA recognition motif.


"5 "7 <-k A A 

2 . /e-ft3 


160.6 


308 


7tm_1


7 transmembrane receptor
(rhodopsin family)


3 . ^e- 


±2 o . 1 


309 


DNA_polymera
seX


DNA polymerase family X


O A a _ C A 


227.2 


311 


F-box 


F-box domain 


3 . DC UO 


jy . z 


312


ig


Immunoglobulin domain


o . oe- 13 


fab . y 


313 


Ets 


Ets-domain


o . ie- o u 


ly z . j 


315 


Kelch 


Kelch motif 


l.Jc- lUb 


367.6 


317 


arf 


ADP-ribosylation factor family




130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase
domain


8.1e-83


288.6


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81


282.6 


324 


Xlink


Extracellular link domain


4.5e-143


331. S 


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain 


8.1e-81 


281.9 


331 


chromo 


'chromo' (CHRromatin 
Organization Modifier) 


4e-18 


66.7 


333 


Peptidase_M2
2


Glycoprotease family 


1.2e-136 


467.4 


335 




von Willebrand factor type A
domain


2.3e-07


37.9


339 




Ras family


7.8e-07


-59.1 


340 




Zinc finger, C2H2 type


8.2e-64


225.4 


342 


zf-C2H2




2.4e-85


297.0


343 


ig


Immunoglobulin domain


0.0005


18.0 


346 




Eukaryotic protein kinase

domain 


D ♦ 3c"DD 


0 0 Q n 

a. y . 1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


351 


EGF 


EGF-like domain 


8.5e-20




352 


ank 


Ank repeat


2.5e-101




354 


TBC 


TBC domain


5.1e-15




355 


PHD 


PHD-finger


J * a© — u / 




358 


DUF6 


Integral membrane protein DUF6




lb * 0 


359 


zf-C2H2


Zinc finger, C2H2 type






361 




Jin If rotioai" 


O . Oc"j4 


126 . 1 


362 


ArfGap


Putative GTP-ase activating
protein for Arf




1 on -t 


363 


efhand


EF hand


5.4e-10


46.6


367 


LRR




8.8e-44


158.9


368 


laminin_G 


Laminin G domain 


1.5e-33 


121.7 


369




Protein phosphatase 2C


5.3e-20


73.9


IT) 


LIM


LIM domain containing proteins


9.9e-15


57.1


J (J 




KRAB box


4.8e-23


90.0


J 1 0 


ion_trans 


Ion transport protein 


2.9e-09


-4.2


1*77 


Beach


Beige/BEACH domain


4.9e-208


704.5


ion 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94


327.5 


381 


AMP-binding


AMP-binding enzyme


1.4e-07


-140.3


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07


-13.5 


384


ank




2.5e-101


350.0 


386




Immunoglobulin domain


9.5e-06


23.6


388




Zinc finger, C2H2 type


1.7e-42


154.6


IRQ 




Immunoglobulin domain 




54.3 


390


mito_carr


Mitochondrial carrier proteins


■a ca c n 


233.2


392




r T 1 T^"0 T^i ^ — i i %-i 

IxrK UOITlGlin 


6.1e-17


69.7 


393


SH3


SH3 domain




43.9


394


AAA


ATPases associated with various

cellular act 


<i . IB - Z 1 


m c 

Oj - D 


396 


SpGC t rm 




2.1e-67




397 


zf-C2H2 


Zinc finger, C2H2 type


v • UUOO 


z J . 1 


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat


n n n n a q 


ZD . O 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119


409.6 


402 


fn3


Fibronectin type III domain


0


1719.6


404


LRR


Leucine Rich Repeat


^ . ie- iu 


48.0 


405 


cadherin


Cadherin domain


8.1e-81


281.9


406 




^aal zmc Linger 


be- ib 


63.4


410 




xxnov7£>r aomciin 


1.1e-23


92.1


411 




F-box domain


4.2e-06


33.7


412 


SNF2_N


SNF2 and others N-terminal
domain 


5.8e-16 


61.6 


415 


CPSase_L_cha 


Carbamoyl-phosphate synthase
(CPSase)


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat


3.8e-24


. 0 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch 


G-patch domain 


le-19 


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repea
t


Plexin repeat 


0.0023 


24.6 


427 


Plexin_repea
t


Plexin repeat


0.0023


24.6


429 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase 


1e-66


214.0 


432 


SH3 


SH3 domain


3.4e-16


67.2


433 


GTP_CDC 


Cell division protein 


2.1e-114 


393.5 


436 




Collagen triple helix repeat 
(20 copies) 


4.6e-194


658.1


4 


Ricin_B_lect
in


similarity to lectin domain of
ricin o


0.0085


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.2e-256


866.0


442 


Alpha_adapti 


Alpha adaptin carboxyl-terminal
domai


1.8e-235 


795.7 


*± J 




PDZ domain (Also known as DHR 
or GLGF) . 


1.9e-65


230.9


445 


LON


ATP-dependent protease La (LON)
domain


0.00012


-17.1 


446 


ig


Immunoglobulin domain


0.00011


20.1


451


sushi


Sushi domain (SCR repeat)


1.4e-18 


75.2 


452 


fn3


Fibronectin type III domain


1.5e-06


35.2


454


pyridoxal_de
C


Pyridoxal-dependent
decarboxylase conse


8.3e-14


50.3


456 


kinesin 


Kinesin motor domain 


4.9e-217


734.4 


457


neur_chan


Neurotransmitter-gated ion-
channel


1e-175


597.1 


458 




uoBcpniii 


U . UU 02 


18.7 


468 


bZIP 


bZIP transcription factor 


1.7e-07 


31.8 


470


MTD >~ ^ « n f -y* 

i\iif^_tiaiisier 


Nucleotidyl transferase 


6.3e-06


-26.3 


471 


WD40


WD domain, G-beta repeat


2e-23


107.9


473 


LIM 


LIM domain containing proteins 


0.00021 


20.7 


477


zf-RanBP


Zn-finger in Ran binding
protein and others.


0.028


21.0


479 


WD40 


WD domain, G-beta repeat 


6.5e-18 


73.0


480 


KRAB 


KRAB box


1e-31


118.8


481 


ArfGap


Putative GTP-ase activating
protein for Arf


8.4e-66


232.0 


485


SH2


Src homology domain 2


0.011


11.4


486 


C1q


C1q domain


4.3e-74 


259.6 


" a sin 


dsrm


Double-stranded RNA binding
motif


1.1e-47


171.9 


4ft Q 

*T O -7 




Zinc finger, C2H2 type


4.8e-153


521.9


490 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


3.4e-222


751.6


492 


SKI 


Shikimate kinase 


1.2e-10 


48.8 


497 


ein 


ENV polyprotein (coat
polyprotein)


2.6e-22


77.6 


498 


abhydrolase_
2


Phospholipase/Carboxylesterase


0.041 


-48.1 


500 


rrm 


RNA recognition motif.


5.4e-34 


126.4 


501 


WW 




4.6e-18


73.4


502 


ig 


Immunoglobulin domain 


1.1e-10


39.5 




abhydrolase


alpha/beta hydrolase fold


0.045


-3.6


505 


vwa 


von Willebrand factor type A
domain 


7.1e-62 


219.0 


508 


Na_K_ATPase_
C


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease


Exonuclease


1.3e-56


201.5


510 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06 


27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38.5 


514 


pro_isomeras 
e 


Cyclophilin type peptidyl-
prolyl cis-tr


1.8e-63 


221.4 


515 


EGF 


EGF-like domain


1.9e-18


74.1


516 


Surp


Surp module 


t . J6 JO 


i a n n 
x^iu . u 


523 


ig


Immunoglobulin domain 


3.3e-06 


25.0 


526 


UBX 


UBX domain


1.1e-34




528 


adh_zinc


Zinc-binding dehydrogenases


2.7e-34




530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


7 C. ra _ D 7 




533 


mito_carr


Mitochondrial carrier proteins 


2e-61 


213.5


534


h l ol ace 




3.5e-183


622.0


535 


FMO-like 


Flavin-binding monooxygenase-
like


0


1153.7


536 


SCAN 


SCAN domain


bb 




537




tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538 




tRNA synthetases class I (I, L, 
M and V) 


3.1e-136


466.0






tRNA synthetases class I (I, L,
M and V)


1.9e-117


403.6


540 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 






Zinc finger, C2H2 type


5.5e-69


242.6


544 


DUF101 


Protein of unknown function
DUF101


8.5e-38 


139.0 


545 


TGFb_propept


TGF-beta propeptide


1.1e-67


238.2


547 


WD40


WD domain, G-beta repeat


2.6e-32


120.8


548 


RHD 


Rel homology domain (RHD).


1.6e-238


686.2


549 


MMR_HSR1 


GTPase of unknown function 


5.4e-67 


236.0 


551 




HECT-domain (ubiquitin-
transferase).


4.3e-127


435.6


554 


MHC_II_alpha


Class II histocompatibility
antigen, alp


3.5e-74


259.8


555 


zf-UBR1


Putative zinc finger in N-


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 




— uinuiny enzyme 


2.8e-06


-163.7


562 


PABP 


Poly-adenylate binding protein,


4.9e-38


139.8


564 


Gag_p30


Gag P30 core shell protein




OTP O 


566 


PWWP 


PWWP domain


a i a i c 


DO .U 


567 


SCAN 


SCAN domain




/jo.y 


569 


pkinase 


Eukaryotic protein kinase
domain




^J? 4 . J 


570 


pkinase 


Eukaryotic protein kinase
domain






571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081 


-79.7 


572 


myosin_head


Myosin head (motor domain)




1 A QC *5 

14i#b . 2 


573 


myosin_head




0


1490.4


575 


Surp 


Surp module


1.7e-23


91.5


576 




Surp module


1.7e-23


91.5


577 


DNA_pol_B


DNA polymerase family B


U 


tub c. 


578 


PDZ 


PDZ domain (Also known as DHR
or GLGF)


Q 1« AO 


42.7


579 


LRR 


Leucine Rich Repeat 


4.9e-21


83.3


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177 


601.3 


583 


sushi 


Sushi domain (SCR repeat) 


0


1673.0 


584 


DEAD 


DEAD/DEAH box helicase


7.3e-36 


116.3 


586 


KH-domain 


KH domain 


2.9e-13 


57.5


587 


G-patch 


G-patch domain 


2.3e-14 


61.2


589


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


592 


hormone_rec 


Ligand-binding domain of 
nuclear hormone


3.5e-22 


87.1 


593 


PHD 


PHD-finger


3.8e-12 


53.8 


cqa 


cadherin 


Cadherin domain 


4.2e-99


342.7




pkinase 


Eukaryotic protein kinase 

domain


5e-92 


319.2 


eon 


WD40


WD domain, G-beta repeat 


U . 


_ 

26.7


cnn 




r Kj—\3J\r repeat 




262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 

domain


2.3e-86


300.4 


CftC 

o us 




Collagen triple helix repeat
(20 copies)


Qes. A*J 




erne 


mito_carr


Mitochondrial carrier proteins 


0 . je- 0 7 


232.3


CrtQ 

OUt) 


PWWP


PWWP domain


z , be- z 0 


1U / . 0 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY 


CAP-Gly domain 


0.0046


20.1


615 


RFX_DNA_bind
ing


RFX DNA-binding domain 


5.2e-54 


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5 


618 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.0098


13.1


620 


MATH 


MATH domain 


7.8e-05 


22.2 


621 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


1.4e-32


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopteri
n


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12


42.2


625 


TPR 


TPR Domain 


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3.7e-58


206.6


630 


adh_short 


short chain dehydrogenase 


5e-17


70.0 


631 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-88 


307.1 


632 


rrm 


RNA recognition motif.


4e-05


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head 


Fork head domain


5.9e-27


103.0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5


642 


TPR 


TPR Domain 


4.8e-08 


40.1 


643 


efhand


EF hand


1.9e-27


104.6


o4 / 


SNF2_N


SNF2 and others N-terminal
domain


1.2e-101


351.1


648 


PseudoU_synt 

11 ^ 1 


RNA pseudouridylate synthase 


1.9e-55 


197.6 


CCA 

DDJ 




Zinc finger, C2H2 type


U . U U 0 / 


. / 


031 


ank 


Ank repeat


j. . je-i / 




652 








J *± X . u 




neur_chan


Neurotransmitter-gated ion-
channel


4.1e-171


001,0 


654 


tsp_1


Thrombospondin type 1 domain


a 1o-^^ 

*i . le-** / 


ICQ Q 
J. D Zt . ZJ 


659 


FH2 


Formin Homology 2 Domain


ie- iu / 


J / 1 . « 


661 


pou 


Pou domain - N-terminal to
homeobox domain 




T /T'i Q 

10^ . y 


662 


C2 


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 




664 


C2 


C2 domain 


6.7e-19 


76.2 


667


GST 


Glutathione S-transferases.


9.3e-34 


114.4 


668 


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


670 


spectrin 


Spectrin repeat


4e-57 


203.2 


671 


I_LWEQ


I/LWEQ domain


9.5e-101 


341.0 


672


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3


675 


WD40 


WD domain, G-beta repeat


4.8e-24


93.3


676 


LRR 


Leucine Rich Repeat 


0.0015




679 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-29 


107.7 


680


zf-C2H2


Zinc finger, C2H2 type 






681 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


682 


DSPc 


Dual specificity phosphatase,
catalytic doma 


* . Jc HJ 


JL30 . o 


683 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.051


ins i 


687 


Synapsin 


Synapsin 


0 


1890.8


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0


1038.8


691 


homeobox 


Homeobox domain 


8.5e-30


112.4


696 


Peptidase_M2
4 


metallopeptidase family M24 


2.6e-59


210.5


697 


RhoGEF 


RhoGEF domain 


9.5e-35


128.9


698 


PHD 


PHD-finger


0.008 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123




702 


Sulfatase


Sulfatase


3e-231 


781.6 


703 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-20


7Q ft 


707 


Acyl_transf


Acyl transferase domain


1.1e-22


88.8 


708 


WD40


WD domain, G-beta repeat 


4 .8e-lS 


76.7 


710 


Ran_BP1


RanBP1 domain


□ . *± e u b 


-7.3 


713 


DEAD 


DEAD/DEAH box helicase 


9.9e-42


134.9 


714 


PH 


PH domain


1.6e-09


39.0 


715 


DSPc 


Dual specificity phosphatase,
catalytic doma


l . oe-j / 


138.2 


717 


Sialyltransf


Sialyltransferase family


/ .JC-Jl 


lib . y 


718




Immunoglobulin domain 


T q _ O Q 


inn q 


719 


integrin_B


Integrins, beta chain


0


Xi^3 .ft 


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2


Calpain family cysteine
protease 


3e-145 




723 


ig 


Immunoglobulin domain 


2.2e-05


22.4


724 


F-box 


F-box domain. 


0.007


23.0


725 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain


8.1e-58


one c 


727 


WD40


WD domain, G-beta repeat




99.3 


730 


dsrm


Double-stranded RNA binding
motif


n 


12.1


731 


dynamin 


Dynamin family 


4.2e-16


CC Q 
OO .7 


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7 


735 


CDP- 

OH_P_transf 


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57


182.5 


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32


119.5 


742 


ras 


Ras family 




J1D . 7 


743 


PMI_typeI 


Phosphomannose isomerase type I


1.2e-243 


822.9 


747 


trypsin


Trypsin




279.4 


748 


kazal 


inhibitor domain 


2.2e-52


187.4


749 


efhand 


EF hand 


£ To- ne 


jj . i 


751 


PHD


PHD-finger


*x . 38" ID 


ob . / 


752 


zf-C2H2


Zinc finger, C2H2 type 


3.2e-21 


83.9


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


D\le-ll 


49.8 


754 


Ribosomal_L3
9 


Ribosomal L39 protein 


0.00018


26.7 


755 


PH


PH domain


3.6e-14


55.7 


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759 


PA 


PA domain 


0.0065 


23.1 


760 


arf


ADP-ribosylation factor family 


2.2e-19 


77.8


761


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6


762 


histone


Core histone H2A/H2B/H3/H4


9.9e-53


188.6


TCI 
f O J 




MYND finger


4.1e-14


60.3


/ D * 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6


767 


vwc 


von Willebrand factor type C


2.9e-34


127.3


769 


efhand


EF hand


4.8e-11


50.1 


770 




Zinc finger, C4 type (two
domains)


2.4e-53


181.6


772 


ras


Ras family


7_ _ on 


312.0 


773 


Sulfatase


Sulfatase




1 a _ 1 An 

ie- 1 __: 


487.5


775 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


776 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5


777 




Zinc finger, C2H2 type


1.1e-12


55.5


778 




RNA recognition motif. 


2.1e-32


121.1


779 


G6PD 


Glucose-6-phosphate
dehydrogenase


1.5e-76


236.6


780 




oyctniti icfcau 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5


782 


SCAN 


SCAN domain


1.3e-24


95.2


783




PDZ domain (Also known as DHR
or GLGF).


4.1e-07


37.1


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7 


786 


ras


Ras family


5.3e-39


143.0


HOT 
/ 0 / 


RNase_HII


Ribonuclease HII


2.5e-67


237.1


7 Qfi 


DTI DTA Uina 

rlj lr l*__Kina 


Phosphatidylinositol 3- and 4- 
kinases 


5.4e-108


372.2


795 


cadherin


Cadherin domain


2.5e-40


147.4


796 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin


Trypsin


9.9e-20


64.8




CH


Calponin homology (CH) domain


3.7e-15


63.8


801


Gal-
bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7 


803 


WD40 



WD domain, G-beta repeat 


0.00082 


26.1 


806 




TBC domain 


1.8e-26


101.4


807 


TBC


TBC domain


1.8e-26


101.4


808 


CN_hydrolase 


Carbon-nitrogen hydrolase


8.8e-80 


278.5 


oil 


CBFD_NFYB_HM
F


Histone-like transcription 
factor 


6e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20


79.3 


at a 

C31.fi 


IMP4 


Domain of unknown function 


3.3e-71


250.0


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66 


232.1 


Q _L O 


Pept_tRNA_hy


Peptidyl-tRNA hydrolase 


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3


O Z D 


TDC ^ T 'C /I _ T C 

lrD^elr4 elr 

2 


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5 


830


ArfGap


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3 


Oil 


LRR


Leucine Rich Repeat 


2.1e-26 


101.1 


OJi 


laminin_EGF


Laminin EGF-like (Domains III 
and V) 


2e-57 


204.2


oIt! q 


rrm 


RNA recognition motif.


1.3e-22


88.5 


840 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


2.6e-119


409.8


DAT 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100 


346.3 


844 


Ribosomal_L2
2e 


Ribosomal L22e protein family 


le-,4 


228.4 


846 


IBR


IBR domain


9e-15 


62.5 


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4








Scavenger receptor cysteine-
rich domain


0 


1025.4




lactamase_B


Metallo-beta-lactamase
superfamily


U . VIZ 


-o . u 


858 


COX6A 


Cytochrome c oxidase subunit
VIa


3.4e-58 


206.7 


859 


rrm 


RNA recognition motif.


5.4e-45 


162.9 


□ ox 


PRK


Phosphoribulokinase


d . xe- 


Z±J . 4 


863 


mito_carr 


Mitochondrial carrier proteins 


2.9e-53 


185.5 




HSP90


Hsp90 protein


4.7e-158


538.5


866 


ig


Immunoglobulin domain


4e-12


44.1


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone 


Core histone H2A/H2B/H3/H4 


4.9e-41


149.8


874 


CPSase_L_cha 
in 


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0


879 


Ribosomal_S1
2e 


Ribosomal protein S12e 


2.1e-98 


340.3


882 


serpin 


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7 


883 


Patatin 


Patatin 


1.2e-51


182.0


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.044


8.0


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12


54.3


889


sugar_tr


Sugar (and other) transporter


8.2e-63


222.1


893 


DUr Z o 


Domain of unknown function 

uucZo 


1.3e-43


158.3




IP_trans


Phosphatidylinositol transfer


6.5e-98


338.7 






DEAD/DEAH box helicase


1.5e-48


ISO . D 




KE2


KE2 family protein


To CI 

/ e-o JL 


i"\ c t 


900 


KE2 


KE2 family protein 


4.3e-51 


183.2 




zf-C2H2


Zinc finger, C2H2 type 


Z . 7Q-0 I 


i ni o 
ZV-i . o 


a no 


ras 


Ras family 


z . Je- /D 


Zv~i . o 


3U4 


TPR


TPR Domain 


J . Zo-ZZ 


a f * Z 


one 


GBP


Guanylate-binding protein


Q Qa OCT 


03J . ± 


907 


GBP 


Guanylate-binding protein


1.1e-239


809.6 


908


WD40 


WD domain, G-beta repeat 


2.6e-26


100.8


909 


PH 


PH domain 


1.3e-09


39.4 


Q1 n 


zf-C2H2


Zinc finger, C2H2 type


2.5e-39


144.1


913 


Epimerase 


NAD dependent 

epimerase/dehydratase family 


5e-07 


-88.5


921 


TBC 


TBC domain 


1.5e-09


30.7


922 


WD40 


WD domain, G-beta repeat 


1.6e-25 


98.2 


923 


WD40 


WD domain, G-beta repeat 


8.2e-07


36.1


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05 


29.1 


925 


UQ_con 


Ubiquitin-conjugating enzyme


0.00033


-27.6


926 


CH 


Calponin homology (CH) domain 


3.3e-53


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48


172.7


929 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


3.1e-10


37.4


930 


Ribul_P_3_ep 
im 


Ribulose-phosphate 3 epimerase
family 


7.2e-10$ 


361.8 


on 


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


1.2e-96


334.4


J ^> o 




v-a noma in 


1 "Jo CI 

Z . ZG-aZ 


ZZU . / 


937 


NAP_family


Nucleosome assembly protein
(NAP)


1.1e-22


84. £ 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75


263.2 


949 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7 


950 


Acyltransfer
ase


Acyltransferase


1.6e-07 


38.4


951 


SAM 


SAM domain (Sterile alpha
motif)




14.5


954 


GFO_IDH_MocA


Oxidoreductase family


1.3e-11


n 

3_ . u 


955 


BTB 


BTB/POZ domain 


7e-22 


86.1 


956 


BTB 


BTB/POZ domain 


7e-22


86.1


957 


CDP- 

OH_P_transf 


CDP-alcohol
phosphatidyltransferase




. 6 


959 


ras 


Ras family 


2.4e-97 


336.8 


960 


ras 


Ras family 


□ • *±c- *t J 


XDD . D 


961 


Acetyltransf


Acetyltransferase (GNAT) family 


1.2e-08 


42.2


962 


adh_short


short chain dehydrogenase


_> . f»e- _» i 


117.6


963 


mutT 


Bacterial mutT protein


a . oe-oo 


26.2


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193 


653.9


970 


RNase_PH


3' exoribonuclease family


9e-24


92.4


975 


WW 


WW domain 


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR
or GLGF)


3.6e-21


83.7


978 


7 


R itan^orna 1 t*iT*i*ir n T.T *7 




81 . 0 


979 


LIM


LIM domain containing proteins


5.8e-42


152.8


980 


Calsequestri
n




1.7e-297


1001.7


982 


HSP20 


Hsp20/alpha crystallin family


1.2e-10


43.2


983 


oxidored_q6


NADH ubiquinone oxidoreductase,
20 Kd sub


4.8e-63


222.9


988 


TBC 


TBC domain


2.2e-50


180.8


989 


TBC 


TBC domain 


2.2e-50 


180.8 


993 


f- PWZi -i -n f- ond 


tRNA intron endonuclease 


0.0017


-34.2


994 


homeobox


Homeobox domain


4e-18


73.6


997 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


0.012


11.6 


1000 




Mitochondrial carrier proteins


9.7e-123 


421.2 


1001 


RA 


Ras association (RalGDS/AF-6)
domain


1.2e-15 


65.4 


1004 


DUF81 


Domain of unknown function
DUF81 


0.099 


10.2 


1005 


actin 


Actin 


1.3e-174 


574.3 


1006 


actin 


Actin 


3.1e-130


428.6 


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195 


661.8 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2 


Zinc finger, C2H2 type


3.6e-61


216.6


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.7e-15 


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A)


2.3e-15 


55.2 


1018


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3 


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18 


69.7 


1026 


HMG_box


HMG (high mobility group) box


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con


Ubiquitin-conjugating enzyme


1.4e-49 


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


0.028 


1^.3 


1034 


Hydrolase


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6 


1037 


KRAB 


KRAB box 


4.8e-06 


32.4 


1038 


Cation_efflu
x


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7 


1043 


zf-C2H2


Zinc finger, C2H2 type


3.7e-24


93.7


1045 


lectin_c


Lectin C-type domain 


1.9e-28 


108.0


1046 


Glucosamine_
iso


Glucosamine-6-phosphate
isomerase


0.00013 


-25.1


1047 


ligase-CoA 


CoA-ligases 


4.5e-80


279.4


1049 


ig 


Immunoglobulin domain 


1.7e-09 


35.6 


1050 


Ribosomal_L2
4e 


Ribosomal protein L24e 


2e-33 


124.5


1054 


Amidase 


Amidase 


4.3e-152


518.7


1055 


rrm 


RNA recognition motif. 


3.8e-26


100.3


1058 


annexin


Annexin


6.9e-44


159.2


1059 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


0.023


-23.6


1060 


homeobox 


Homeobox domain 


3.2e-31


117.2


1062 


Acyltransfer 
ase 


Acyltransferase


0.00065 


10.5 


1064 


AMP-binding 


AMP-binding enzyme 


6.6e-100


345.3


1065 


LRR 


Leucine Rich Repeat 


3.3e-14


60.6


1066 


GTP1_OBG


GTP1/OBG family 


4.8e-41 


141.8 


1071 


ig 


Immunoglobulin domain 


8.4e-48


159.1


1072 


PHD 


PHD-finger


6.8e-07


36.3 


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5 


1075 


SCP 


SCP-like extracellular protein 


4.7e-41


149.8


1077 


OLF 


Olfactomedin-like domain


2.2e-66


234.0 


1078 


mito_carr


Mitochondrial carrier proteins


1e-42




1079 


WD40


WD domain, G-beta repeat


6.2e-45


162.7 


1087 


START 


START domain 


x . De-fto 


X 1 *% . / 


1093 


DSPc 


Dual specificity phosphatase,
catalytic doma


J . JC*DJ 


223.4 


1094 


GSHPx 


Glutathione peroxidases


9.6e-41


148.8


1095 


DUF25 


Domain of unknown function
DUF25


AC" /D 


264.0


1096 


DUF25


Domain of unknown function
DUF25


be- /b 




1105


Nitroreducta 
se 


Nitroreductase family 


1.3e-13


qfl "6 

DO.D 


1106 


PTE 


Phosphotriesterase family 


X . JC X / S 


O X U . X 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain


0.00049 


19.6 


1109 


ras 


Ras family 


J. . Je-i3 


40.7


1115 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


9.7e-47 


168.7 


1116 


HMG14_17 


HMG14 and HMG17 


t » 4C ii. 


83.5


1117 


HMG14_17


HMG14 and HMG17 


Q Qo 1 "i 




1119 


FAA_hydrolas
e 


Fumarylacetoacetate (FAA) 
hydrolase fam 


■c- e - o o 


290.6


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 


1123 


abhydrolase 


alpha/beta hydrolase fold


9.2e-23 


89.0 


1129 


pro_isomeras
e


Cyclophilin type peptidyl-
prolyl cis-tr


2.2e-56


1 Q*7 1 


1131 


DnaJ 


DnaJ domain 


1.6e-30 


114.9 


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9 


1134 


PH 


PH domain 


0.0015


T *7 ft 
X / . O 


1136 


Adap_comp_su
b


Adaptor complexes medium
subunit family


1.2e-256


OCD . U 


1137 


Adap_comp_su
b 


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8 


1139 


ras 


Ras family 


X . 3C" O O 


J U X . u 


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransfer 
ase 


Acyltransferase


1.2e-05 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55 


196.1 


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9 


1157 


Asparaginase 
_2 


Asparaginase


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9


1163 


linker_histo
ne


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain 


3.9e-05 


30.5


1165 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3


1168 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04


10.5 


1170 


abhydrolase 


alpha/beta hydrolase fold 


0.098 


-7.5 


1174 


SAP 


SAP domain 


3.9e-10 


47.1


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180 


Ets 


Ets-domain 


1.8e-09 


33.3


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016


24.7


1182 


TCL1_MTCP1 


TCL1/MTCP1 family 


9.5e-56 


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito_carr 


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain


0.0042


15.6


1188 


Orn_DAP_Arg_ 
deC 


Pyridoxal-dependent
decarboxylase 


6.2e-128 


430.6 


1193 


Stathmin 


Stathmin family 






1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1195 


Sec1


Sec1 family




COO 1 

bZZ . ± 


1196 




Pyridine nucleotide-disulphide


3.1e-32


111.8


1197 


Glyco_transf 
8 


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 




K+ channel tetramerisation
domain


u . uzz 


" la . o 


1203 


adh_short


short chain dehydrogenase


8.3e-45


162.3


1206 


Ubie_methylt
ran


ubiE/COQ5 methyltransferase
family


1.3e-121


417.4


1208 


7tm_3


7 transmembrane receptor


7.2e-09


29.0


1209 


ank 


Ank repeat 


3.9e-15


63.7


1210 


vATP- 
synt_AC39


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 


1212


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9


1213 


efhand 


EF hand 


3.2e-07 


37.4 


1219 


rrm 


RNA recognition motif.


2.1e-40


147.7


1220 


DUF6 


Integral membrane protein DUF6 


0.015


21.5


1222 


SCAN 


SCAN domain 


1.5e-71 


251.1 


1223 


G-gamma


GGL domain


3.6e-36


129.5


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 








1233 


PX 


PX domain 


2.2e-15


64.5


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09


44.0 


1241 


Peptidase_M2
0 


Peptidase family M20/M25/M40 


2e-63 


224.1


1243 


WW 


WW domain 


0.044


17.9


1247


UPF0006 


Metalloenzyme of unknown 
function UPF0006 


6.3e-61 


215.8 


1248 


Glycos_trans
f_2 


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand 


EF hand 


4e-11


50.4 


1254 


UQ_con


Ubiquitin-conjugating enzyme


z . le -■ 1 .} 


TCI 1 

Zo / . J 


1255 


ras 


Ras family


z . ^e- bz 


4<2U . / 


1256


formyl_trans
f


Formyl transferase


4.9e-30


108.3


1259 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_re
d 


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transp 
ept 


Gamma-glutamyltranspeptidase


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265


LRR 


Leucine Rich Repeat 


4.2e-22


86.9


1266 


SCP 


SCP-like extracellular protein 


6e-29 


108.0 


1267 


K_tetra 


K+ channel tetramerisation 
domain 


2.8e-27 


104.0 


1269 


ras 


Ras family 


1.3e-85


297.9 


1 0*7 C 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-10


37.0




uv, t 

abhydrolase


alpha/beta hydrolase fold


5.4e-23


89.8


1277 


abhydrolase 


alpha/beta hydrolase fold 


5.6e-21 


83.1


1 O *7 Q 


trypsin 


Trypsin 


4.4e-41


132.0


i ono 

Izh u 


PBP 


Phosphatidylethanolamine-
binding protein


1.3e-13


58.7 


1 TOC 

i£03 


zf-C3HC4


Zinc finger, C3HC4 type (RING


5.6e-14


49.6




c 

ank 


Ank repeat 


1.7e-52


187.8


1294 


fn3


Fibronectin type III domain


0.026


20.9






Guanylate-binding protein


0.00026


-70.0


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 



6.9e-41 


149.3 


1 TOT 
JL ^ 3 / 


Rhodanese


Rhodanese-like domain


3.2e-14


60.7


1298 


LIM 


LIM domain containing proteins 


5.8e-21 


79.1 




rnaseA 


Pancreatic ribonucleases


4.9e-43


145.2


1307 


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308


WD40 


WD domain, G-beta repeat 


1.6e-17


71.6 


1310


UPAR_LY6 


u-PAR/Ly-6 domain 


7.1e-20


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316


trypsin 


Trypsin 


4.4e-41


132.0 


1320 


Ribosomal_L1
3


Ribosomal protein L13


3.9e-62


219.8 


1327 


Armadillo_se
g


Armadillo/beta-catenin-like
repeats


0.0054


23.4 


1328 


KRAB 


KRAB box 


0.052


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10


48.0 


1333 


KRAB 


KRAB box 


1.8e-36 


134.6 


1334 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


2.3e-89 


310.3 


1335 


UPP_syntheta
se 


Putative undecaprenyl 
diphosphate synt 


1.8e-59 


211.0 


1336 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.2e-31


118.6




DSPc


Dual specificity phosphatase,
catalytic doma


2.3e-12


54.5


1338




lifts, uomain 


0.00021


28.1


1340


metalthio


Metallothionein


0.013


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09 


36.5 


1 "\ A 1 
± J *± o 


Band_41


FERM domain (Band 4.1 family)


1.3e-38


122.5


1344 


Kelch 


Kelch motif 


1.4e-44


161.5 




Antifreeze 


Antifreeze protein 


1.2e-10 


48.8


t i a n 


3Beta_HSD


3-beta hydroxysteroid
dehydrogenase/isomera


0.086 


-177.2 


1348 


BTB 


BTB/POZ domain 


5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 




Myosin head (motor domain)


Q 




1352 


Nramp


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-*5 


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2 


Zinc finger, C2H2 type


7.4e-141 


481.4 


1361 


HMG14_17 


HMG14 and HMG17 


7.9e-40 


145.7


13 6*2 


sis 


qtc ri remain 


3 . 8e-30 


113 . 6 


1363 


sis 


Old tuUUlCtXIl 


1 . 3e-28 


108 . 5 


1364 




Tniiwinnol fVhnl i n r)nma i n 
xuuuixiiuy luwui in uuuidJ.li 


U . 00026 


19 . 0 


1368 


K tetra 


v a. fViannpl t" Pt* ramp ri aah •» 

domain 


1 . le-lb 


68,9 


1371 


Collagen 


Collaaen triDle heliie renpat 
(20 copies) 




ion i 


1372 


DnaJ 


DnaJ domain 


O . DC - J O 


m n 
L5£ . 1 


1376 


KRAB 




"7 la IQ 
d> . J.C-J O 


141 . 0 


1378 


ELM2 


ELM 2 domain 


£>\2 — 4 3 


yi . j 


1380 


thiored 


Thioredoxin 


n Oo _ *5 1 


82 . 8 


1381 


ank 




& . je* oj 


290 . 4 


1382 


BTB 


RTR/PnTL noma ^ n 


3e-ll 


50 . 8 


1383 


WD40 


WO rioTTiA i n fl-V^^ha rranpah 


1 . 6e- 19 


78 . 3 


1384 


WD40 


riU uuillctxilf u UcCn IcpcdU 


6 . 3e*-24 


92 . 9 


1387 


zf - C3HC4 


xixiii, j-xjiycsx, t.oxi^'* type \kxiyij 

f 1 T"ltTS»T* ^ 

j. my ci y 


1 . le-09 


35.6 


1389 


zf -C2H? 


xi xi 11* j. j.iiy csx f \_ <s n>5 type 




179 . 5 


1390 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-85 


296.9 


1393 


lc i n fj t n 

rw.J-llC5t3J.il 


r\.j- « hi motor aoiuain 


7 . 8e-188 


637,4 


1394 




iiinc iiuycx, type 


1 . 2e-49 


178 . 4 


1398 


KRAB 


KRAB box 


5.1e-22 


86.6 






bZIP transcription factor 


0 . 035 


13 . 1 




sugar tr 


Sugar {and other) transporter 


0 . 003 


-101 .5 


i a nc 
Aft Ho 




RhoGAP domain 


8 . 9e-47 


168.8 


1407 


rrm 


RNA recognition motif. 


le-35 


132 . 1 


i Ann • 


T.DD 


Leucine Rich Repeat 


2 . le-13 


58 . 0 


Xf* U 17 


Nebulin repe 
at 


Nebulin repeat 


6e-54 


192 .6 


1410 




/ihk. repeat 


1 . 6e- 17 


71.6 


1412 




iiwuBomai bor iciiuiiy L-uernnnus 


8 . 2e-58 


205.5 


1415 




i typsin 


4 . 7e-85 


•270.4 


1416 


aminotran 1 


Aminotransferases class-I 


4.4e-05 


-91.2 


1417 


SI 


SI RNA binding domain 


1 . 6e-07 


33 . 1 


1419 


WD40 


WD domain, G-beta repeat 


2 . 2e- 09 


44 . 6 


1422 


LdUJ lt2 X 111 


Cadherin domain 


8 . 3e-42 


152 . 3 


1424 


SH3 


ori j aonidin 


2 . 5e-80 


280 .3 


1425 


PHD 


Fnij- jl inger 


3 . 2e-17 


70 . 6 


1426 




nun _ f J nrfO v 

rnu- ring er 


3 . 2e-17 


70 . 6 


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


le-37 


138.8 


1428 


Via ~1 t 

ilciXCabc 


Helicases conserved C- terminal 

uuiuoiii 


le-26 


102 .2 


1429 


WD40 


WD domain, G-beta repeat 


3 . 9e-07 


37.2 


1430 


"1 nns i i~n1 P 

iliiuo,! jr 


liiubicoi itionopxiospiiac sse laniixy 


I. . oe-10 


4 0.2 


1431 


mit*rt parr 


ruLocnonariai carrier proneins 


** . je- oj 


287.7 


1433 


CI a 


v»xc[ uoniain 


2 . 9e-16 


66.2 


1434 


WD40 


riu tiuuidxii, vj-oeta repeat. 


1 Ccx 1 1 

1 . oe-lj 


58.3 


1435 


Inos-i~ 
P synth 


wyo-inoai toi-x-pnospnate 
«ivm 1" ha qp 

o y a Luaoc 


7„ 1 *5 Q 


770 . 4 


1436 


rrm 


tctuy iix Liun uiuLxx . 


1 Ap.lA 


TOO 1 
XZ O . J 


1438 


iq 


xiiuiiu(iu(j lujjui in uuiuain 


1 ~lc* -1 n 
± . Jc-li 


43.0 


1440 


G Adapt CT 


waii una - auap t xn, v.~termxnus 


1 A a (C7 

J .4c-o / 


zJo. / 


1441 


G Adaot CT 


3 mnlf) ^ H ^ y*\ ^* l Z™ 1 _ K A ww q nun 

odiiiuici - ci(_icip t in , t-teniiiuus 


J . fie - D / 


ZJO . / 


1443 


Kelch 


KaI rh mot* ■! F 


U . www 1j 


no *7 
« 0 • f 


1446 


ARID 


ARID DNA binding domain 


1 . 8e-21 


84 . 7 


1447 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-28 


105.6 


1448 


AMP-binding 


AMP-binding enzyme 


2.6e-07 


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-2l 


82.9 


1454 


ig 


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf 


Sialyltransferase family 


5.4e-21 


83 .2 


1460 


Aldose_epim 


Aldose 1-epimerase 


1.9e-35 


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6 


1470 


TIG 


IPT/TIG domain 


3.1e-19 


77.3 


1472 


PseudoU_synt 


RNA pseudouridylate synthase 


4.3e-16 


66.9 
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SEQ ID 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 




h 9 








1474 


DENN 


DENN (AEX-3) domain 


1.3e-44 


161.6 




Cation ef flu 

X 


Cation efflux family 


4 . fie -49 


176 . 4 


i a in 
13 / / 


lot 


TBC domain 


8e-47 


169 . 0 


1478 


rrm 


RNA recognition motif. 


2e-21 


84.6 


i a an 


!9 


Immunog 1 obu 1 i n doma i n 


5 . 5e-06 


24 .3 


1484 


Telo_bind_al 
pna 


Telomere -binding protein alpha 
subuni 


0.028 


-225.9 


1485 


zf-C2H2 


Zinc finger, C2H2 type 


1.8e-68 


240.9 


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9 . 5e-13 


49 .9 


1488 


helicase_C 


Helicases conserved C- terminal 
domain 


1.4e-15 


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 

, .. _ 


Enoyl-CoA hydratase/isomerase 
family 


5 . 2e-41 


149 . 7 


1491 


guanylate cy 

c 


Adenylate and Guanylate cyclase 
catalyt 


5 . 9e-46 


166 . 1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19 


77.2 


1495 


ZI -C3HU4 


Zinc ringer, C3HC4 type (RING 
finger) 


7 . le-10 


36.3 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


le-22 


85 . 8 


1500 


CUT 


SH3 domain 


9 . 3e~05 


27.2 


1502 


hon\eobox 


Homeobox domain 


0.034 


13 . 8 


1503 


homeobox 


Homeobox domain 


0. 084 


13.8 


1505 


EGF 


EGF- like domain 


2 . 7e-23 


90 . 8 


1506 


UCH-2 


Ubiquitin carboxyl- terminal 
hydrolase family 


2 . 7e-21 


84 . 2 


1508 


Peptidase_M2 
0 


Peptidase family M20/M25/M40 


2 . 8e-28 


101. 8 


1511 


PX 


PX domain 


1 . 9e-ll 


51 . 5 


1512 


Sulf atase 


Sulf atase 


2 . 8e-35 


130 . 7 


1516 


Syntaxin 


Syntaxin 


0 . 011 


-62.3 


151B 


aminotran_3 


Aminotransferases class- III 
pyridoxal -pho 


9.7e-106 


305.6 


1520 




Immunoglobulin domain 


0 . 075 


11.0 




Da 


Ras association (RalGDS/AF-6) 
domain 


0 . 013 


13 . 3 




KilObAr 


RhoGAP domain 


d. . be- Ub 


18.7 


1528 


WD4 0 


WD domain, G-beta repeat 


5.4e-24 


93.1 


lb J b 


1MB 


impB/mucB/samB family 


7 . 8e-95 


328.5 


lb J o 


r x VEi 


FYVE zinc finger 


3 . 2e-27 


101.5 


1539 


DAGKc 


Diacyiglycerol kinase catalytic 
domain 


6e- 07 


36.5 


1540 


Ocular alb 


Ocular albinism type 1 protein 


0 


1184 , 7 


It) bo 


GAD 

b/Ur 


bAr aomam 


6e-06 


33 . 2 


1 C A 

lo b4 


Amino oxidas 

e 


Flavin containing amine oxidase 


3 . 2e- 43 


157 . 0 




Amino oxidas 


Flavin containing amine oxidase 


3 . 2e- 43 


157.0 


1656 




t\iL\j\jctc domain 


T At> T A 


33.1 


1657 


MMR_HSR1 


GTPase of unknown function 


0.0011 


-45.5 


lb by 


UUrl- .£ 


Ubiquitin carboxyl - terminal 

hvdrolasp fam^ 1 \r 


2 . 5e-ll 


51 . 1 


1660 


actin 


Actin 


6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324.4 


1669 


Noll_Nop2_Su 
n 


N0Ll/NOP2/sun family 


1.3e-23 


84.3 


1671 


SH2 


Src homology domain 2 


5.4e-l5 


46 .9 
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SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p— value 




1672 


chromo 


'chromo' (CHRromatin 
Organization Modifier) 


2 . le-lS 


67.7 


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


17 . 6 


1676 


Glyco_hydro 
47 


Glycosyl hydrolase family 47 


1.8e-187 


636.2 


1677 


Glyco_hydro 
47 


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat 


l.le-27 


105.5 


1681 


WD40 


WD domain, G-beta repeat 


l.le-27 


105 . 5 


1683 


MMR_HSR1 


GTPase of unknown function 


1 . 8e-78 


274 . 1 


1691 


rrm 


RNA recognition motif. 


1 . 8e-37 


137 . 9 


1692 


rrm 


RNA recognition motif. 


1.8e-37 


137 . 9 


1*93 


AAA 


ATPases associated with various 
cellular act 


1 ,3e-81 


284 . 5 


1697 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


8 .4e-82 


285.2' 


1698 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


3 .5e-53 


190 . 1 


1699 


zf-C2K2 


Zinc finger, C2H2 type 


4 . 4e-34 


126 . 6 


1700 


arf 


ADP-ribosylation factor family 


9e-19 


75.8 


1702 


GTP_EFTU 


Elongation factor Tu family 


0 . 014 


11 . 4 


1703 


SCAN 


SCAN domain 


1.8e-54 


194 .4 


1707 


pkinase 


Eukaryotic protein kinase 
domain 


1 . 2e-88 


307 . 9 


1709 


WD4Q 


WD domain, G-beta repeat 


0 . D035 


24 . 0 


1710 


LRR 


Leucine Rich Repeat 


1 . 2e-30 


115 . 3 


1711 


WW 


WW domain 


7.6e-12 


52.8 


1712 


ank 


Ank repeat 


4 . 2e-34 


126 , 7 


1713 


zf -CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8 -C-x5-C-x3 -H 
type 


2.6e-09 


38.3 


1715 


ras 


Ras family 


4 . 4e-41 


149 . 9 


1718 


HMG_box 


HMG (high mobility group) box 


8 . 3e-21 


82 . 6 


1719 


TBC 


TBC domain 


1 . le-45 


165 . 2 


1721 


HLH 


Helix- loop-helix DNA-binding 
domain 


9.2e-10 


45.9 


1723 


dsrm 


Double- stranded RNA binding 
motif 


2 . 9e-05 


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine 
dimethyl ases 


0.045 


9.2 


1725 


CIDE-N 


CIDE-N domain 


5 . 9e-40 


146 . 2 


1726 


HAT 


HAT (Half -A-TPR) repeats 


2 . 9e-44 


160 . 5 


1728 


efhand 


EF hand 


5 . le-20 


79 . 9 


1733 


Hist deacety 
1 


His tone deacetylase family 


1.7e-104 


360.6 


1735 


LRR 


Leucine Rich Repeat 


4 .6e-34 


12$.6 


1739 


PI-PLC-X 


Phospha t i dyl inos itol-specific 
phospholipase 


0.0023 


16. 1 


1743 


ras 


Ras family 


3.7e-10 


-21.3 


1744 


ras 


Ras family 


3.7e-10 


-21.3 


1745 


RasGEF 


RasGEF domain 


3.2e-49 


176.9 


1746 


adh_short 


short chain dehydrogenase 


7.1e-08 


34.6 


1751 


zf-C2H2 


Zinc finger, C2H2 type 


9e-.39 


142.2 


1754 


rn3 


Fibronectin type III domain 


5.5e-l01 


348.9 


1756 


z£-C2H2 


Zinc finger, C2E2 type 


6.3e-93 


322.1 


1758 


rrm 


RNA recognition motif. 


0.017 [ 


21.2 


1760 


Nop I 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765 


MMR_HSR1 


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase 


Carbon -nitrogen hydrolase 


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779 


OxysterolJBP 


Oxysterol -binding protein 


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 
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SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1785 


rrm 


RNA recognition motif. 


6. 4e-14 


S9.7 



TABLE 5

SEQ ID NO: | POSITION OF SIGNAL IN AMINO ACID SEQUENCE | MaxS (MAXIMUM SCORE) | MeanS (MEAN SCORE)
1 | 1-21 | 0.991 | 0.955
2 | 1-31 | 0.995 | 0.944
3 | 1-33 | 0.949 | 0.736
4 | 1-19 | 0.970 | 0.951
5 | 1-26 | 0.971 | 0.863
6 | 1-26 | 0.971 | 0.863
7 | 1-26 | 0.971 | 0.863
8 | 1-26 | 0.971 | 0.863
9 | 1-46 | 0.982 | 0.901
10 | 1-21 | 0.991 | 0.955
11 | 1-23 | 0.989 | 0.899
12 | 1-25 | 0.955 | 0.803
13 | 1-18 | 0.932 | 0.625
14 | 1-18 | 0.938 | 0.876
15 | 1-25 | 0.941 | 0.811
16 | 1-17 | 0.972 | 0.939
17 | 1-27 | 0.964 | 0.777
18 | 1-16 | 0.914 | 0.657
19 | 1-19 | 0.953 | 0.840
20 | 1-20 | 0.935 | 0.701
21 | 1-22 | 0.974 | 0.850
22 | 1-33 | 0.961 | 0.895
23 | 1-19 | 0.991 | 0.959
24 | 1-31 | 0.995 | 0.944
25 | 1-22 | 0.976 | 0.935
26 | 1-27 | 0.996 | 0.928
27 | 1-24 | 0.953 | 0.739
28 | 1-21 | 0.906 | 0.688
29 | 1-31 | 0.986 | 0.841
30 | 1-28 | 0.980 | 0.893
31 | 1-19 | 0.993 | 0.976
32 | 1-22 | 0.998 | 0.909
35 | 1-33 | 0.949 | 0.736
36 | 1-33 | 0.949 | 0.736
46 | 1-19 | 0.970 | 0.951
67 | 1-25 | 0.968 | 0.848
71 | 1-18 | 0.949 | 0.845
72 | 1-30 | 0.991 | 0.919
75 | 1-29 | 0.958 | 0.854
88 | 1-20 | 0.986 | 0.945
94 | 1-33 | 0.994 | 0.943
97 | 1-46 | 0.964 | 0.595
103 | 1-49 | 0.983 | 0.570
108 | 1-26 | 0.978 | 0.885
111 | 1-23 | 0.989 | 0.899
126 | 1-25 | 0.955 | 0.803
129 | 1-19 | 0.963 | 0.918
138 | 1-29 | 0.971 | 0.844
143 | 1-18 | 0.914 | 0.628
148 | 1-20 | 0.969 | 0.904
156 | 1-25 | 0.941 | 0.811
158 | 1-22 | 0.979 | 0.927
160 | 1-17 | 0.972 | 0.939
161 | 1-48 | 0.903 | 0.571
162 | 1-25 | 0.937 | 0.729
168 | 1-16 | 0.939 | 0.826
171 | 1-27 | 0.964 | 0.777
178 | 1-21 | 0.945 | 0.825
180 | 1-27 | 0.981 | 0.941
187 | 1-28 | 0.982 | 0.936
190 | 1-19 | 0.953 | 0.840
196 | 1-22 | 0.975 | 0.916
197 | 1-22 | 0.963 | 0.936
199 | 1-20 | 0.935 | 0.701
200 | 1-23 | 0.977 | 0.773
206 | 1-30 | 0.984 | 0.890
207 | 1-19 | 0.990 | 0.924
208 | 1-22 | 0.974 | 0.850
210 | 1-40 | 0.940 | 0.670
211 | 1-28 | 0.971 | 0.849
216 | 1-24 | 0.986 | 0.956
218 | 1-33 | 0.961 | 0.895
219 | 1-19 | 0.970 | 0.871
221 | 1-19 | 0.904 | 0.553
222 | 1-21 | 0.917 | 0.555
230 | 1-19 | 0.991 | 0.959
231 | 1-26 | 0.953 | 0.800
232 | 1-25 | 0.988 | 0.826
239 | 1-23 | 0.969 | 0.828
240 | 1-17 | 0.982 | 0.955
241 | 1-17 | 0.982 | 0.955
245 | 1-30 | 0.970 | 0.722
248 | 1-22 | 0.976 | 0.935
249 | 1-23 | 0.968 | 0.940
252 | 1-18 | 0.971 | 0.923
261 | 1-24 | 0.883 | 0.587
265 | 1-18 | 0.939 | 0.868
272 | 1-24 | 0.953 | 0.739
283 | 1-21 | 0.906 | 0.688
284 | 1-29 | 0.997 | 0.854
290 | 1-31 | 0.986 | 0.841
302 | 1-28 | 0.980 | 0.893
304 | 1-16 | 0.907 | 0.635
312 | 1-19 | 0.993 | 0.976
313 | 1-17 | 0.930 | 0.753
323 | 1-22 | 0.998 | 0.909
324 | 1-17 | 0.982 | 0.954
328 | 1-19 | 0.971 | 0.865
329 | 1-22 | 0.963 | 0.924
330 | 1-33 | 0.978 | 0.841
331 | 1-24 | 0.920 | 0.712
332 | 1-24 | 0.975 | 0.881
333 | 1-19 | 0.984 | 0.941
334 | 1-20 | 0.899 | 0.567
335 | 1-27 | 0.942 | 0.813
336 | 1-20 | 0.952 | 0.850
337 | 1-38 | 0.942 | 0.653
338 | 1-27 | 0.973 | 0.772
339 | 1-36 | 0.979 | 0.804
340 | 1-27 | 0.888 | 0.597
343 | 1-19 | 0.971 | 0.865
344 | 1-22 | 0.994 | 0.928
345 | 1-17 | 0.966 | 0.687
346 | 1-19 | 0.936 | 0.822
347 | 1-22 | 0.963 | 0.924
349 | 1-24 | 0.982 | 0.966
351 | 1-21 | 0.918 | 0.815
352 | 1-31 | 0.988 | 0.912
354 | 1-31 | 0.974 | 0.839
355 | 1-29 | 0.932 | 0.632
356 | 1-15 | 0.994 | 0.969
357 | 1-33 | 0.935 | 0.726
360 | 1-27 | 0.938 | 0.827
361 | 1-25 | 0.954 | 0.674
362 | 1-22 | 0.929 | 0.788
363 | 1-21 | 0.881 | 0.715
364 | 1-33 | 0.978 | 0.841
365 | 1-33 | 0.978 | 0.841
366 | 1-21 | 0.916 | 0.820
367 | 1-19 | 0.936 | 0.822
368 | 1-29 | 0.972 | 0.874
370 | 1-24 | 0.920 | 0.712
371 | 1-24 | 0.961 | 0.773
372 | 1-27 | 0.919 | 0.768
373 | 1-19 | 0.986 | 0.945
375 | 1-32 | 0.994 | 0.932
376 | 1-34 | 0.987 | 0.810
377 | 1-17 | 0.995 | 0.950
378 | 1-49 | 0.971 | 0.749
380 | 1-20 | 0.968 | 0.874
381 | 1-20 | 0.928 | 0.782
382 | 1-19 | 0.986 | 0.934
383 | 1-28 | 0.965 | 0.829
384 | 1-39 | 0.970 | 0.551
386 | 1-24 | 0.975 | 0.881
388 | 1-30 | 0.989 | 0.868
389 | 1-19 | 0.984 | 0.941
390 | 1-26 | 0.971 | 0.782
392 | 1-20 | 0.981 | 0.900
393 | 1-16 | 0.968 | 0.890
394 | 1-23 | 0.937 | 0.701
397 | 1-22 | 0.985 | 0.854
399 | 1-46 | 0.977 | 0.698
401 | 1-20 | 0.899 | 0.567
402 | 1-22 | 0.967 | 0.931
403 | 1-27 | 0.992 | 0.934
404 | 1-19 | 0.991 | 0.973
405 | 1-23 | 0.994 | 0.921
407 | 1-35 | 0.987 | 0.658
408 | 1-39 | 0.976 | U . ->->JL
409 | 1-33 | 0.897 | 0.570
410 | 1-25 | 0.990 | 0.962
411 | 1-38 | 0.977 | 0.827
412 | 1-20 | 0.944 | 0.768
413 | 1-20 | 0.988 | 0.965
414 | 1-46 | 0.993 | 0.638
415 | 1-23 | 0.981 | 0.940
417 | 1-29 | 0.941 | 0.672
418 | 1-20 | 0.952 | 0.850
419 | 1-19 | 0.986 | 0.967
420 | 1-29 | 0.965 | 0.861
421 | 1-22 | 0.889 | 0.785
422 | 1-48 | 0.982 | 0.862
424 | 1-19 | 0.979 | 0.933
428 | 1-38 | 0.942 | 0.653
430 | 1-18 | 0.947 | 0.595
432 | 1-33 | 0.957 | 0.789
433 | 1-26 | 0.979 | 0.904
434 | 1-27 | 0.962 | 0.777
435 | 1-24 | 0.998 | 0.977
436 | 1-27 | 0.973 | 0.772
443 | 1-15 | 0.966 | 0.940
448 | 1-36 | 0.979 | 0.804
453 | 1-41 | 0.958 | 0.609
455 | 1-33 | 0.943 | 0.606
457 | 1-27 | 0.888 | 0.597
462 | 1-16 | 0.925 | 0.681
486 | 1-27 | 0.972 | 0.845
495 | 1-24 | 0.917 | 0.636
498 | 1-26 | 0.993 | 0.890
505 | 1-20 | 0.976 | 0.926
507 | 1-17 | 0.966 | 0.687
510 | 1-23 | 0.930 | 0.593
511 | 1-23 | 0.930 | 0.593
512 | 1-23 | 0.930 | 0.593
515 | 1-18 | 0.978 | 0.956
523 | 1-19 | 0.936 | 0.822
529 | 1-22 | 0.963 | 0.924
545 | 1-24 | 0.982 | 0.966
550 | 1-30 | 0.933 | 0.713
552 | 1-21 | 0.973 | 0.912
554 | 1-23 | 0.969 | 0.784
571 | 1-21 | 0.918 | 0.815
574 | 1-31 | 0.988 | 0.912
580 | 1-39 | 0.925 | 0.556
594 | 1-31 | 0.974 | 0.839
608 | 1-29 | 0.932 | 0.632
609 | 1-29 | 0.932 | 0.632
610 | 1-21 | 0.990 | 0.948
621 | 1-15 | 0.994 | 0.969
623 | 1-33 | 0.935 | 0.726
653 | 1-27 | 0.938 | 0.827
668 | 1-22 | 0.929 | 0.788
677 | 1-16 | 0.948 | 0.807
685 | 1-21 | 0.881 | 0.715
699 | 1-22 | 0.975 | 0.816
702 | 1-31 | 0.968 | 0.898
707 | 1-16 | 0.880 | 0.562
713 | 1-25 | 0.966 | 0.743
718 | 1-19 | 0.936 | 0.822
719 | 1-20 | 0.961 | 0.824
729 | 1-29 | 0.972 | 0.874
735 | 1-46 | 0.903 | 0.598
746 | 1-14 | 0.916 | 0.730
747 | 1-22 | 0.965 | 0.876
748 | 1-29 | 0.968 | 0.785
759 | 1-24 | 0.961 | 0.773
767 | 1-27 | 0.919 | 0.768
768 | 1-33 | 0.900 | 0.585
773 | 1-42 | 0.959 | 0.702
779 | 1-19 | 0.986 | 0.945
797 | 1-19 | 0.944 | 0.759
798 | 1-19 | 0.900 | 0.568
820 | 1-17 | 0.995 | 0.950
827 | 1-49 | 0.971 | 0.749
848 | 1-20 | 0.968 | 0.874
864 | 1-20 | 0.928 | 0.782
866 | 1-19 | 0.986 | 0.934
873 | 1-23 | 0.948 | 0.886
881 | 1-28 | 0.965 | 0.829
887 | 1-39 | 0.970 | 0.551
927 | 1-30 | 0.989 | 0.868
934 | 1-48 | 0.988 | 0.777
939 | 1-39 | 0.994 | 0.889
944 | 1-26 | 0.971 | 0.782
950 | 1-29 | 0.957 | 0.845
963 | 1-20 | 0.981 | 0.900
964 | 1-20 | 0.886 | 0.558
973 | 1-16 | 0.968 | 0.890
980 | 1-34 | 0.961 | 0.749
981 | 1-20 | 0.953 | 0.822
984 | 1-12 | 0.938 | 0.780
1015 | 1-22 | 0.985 | 0.854
1040 | 1-46 | 0.977 | 0.698
1052 | 1-18 | 0.969 | 0.842
1059 | 1-20 | 0.927 | 0.867
1065 | 1-33 | 0.983 | 0.918
1069 | 1-22 | 0.993 | 0.935
1075 | 1-27 | 0.992 | 0.934
1080 | 1-19 | 0.931 | 0.829
1092 | 1-19 | 0.991 | 0.973
1094 | 1-46 | 0.992 | 0.653
1095 | 1-30 | 0.974 | 0.929
1105 | 1-23 | 0.994 | 0.921
1123 | 1-35 | 0.987 | 0.658
1138 | 1-32 | 0.954 | 0.613
1140 | 1-38 | 0.989 | 0.789
1142 | 1-33 | 0.897 | 0.570
1152 | 1-25 | 0.990 | 0.962
1170 | 1-38 | 0.977 | 0.827
1176 | 1-20 | 0.944 | 0.768
1187 | 1-20 | 0.988 | 0.965
1189 | 1-35 | 0.967 | 0.839
1192 | 1-46 | 0.993 | 0.638
1193 | 1-16 | 0.925 | 0.710
1197 | 1-29 | 0.985 | 0.853
1208 | 1-23 | 0.981 | 0.940
1225 | 1-29 | 0.941 | 0.672
1245 | 1-19 | 0.986 | 0.967
1258 | 1-29 | 0.965 | 0.861
1265 | 1-22 | 0.889 | 0.785
1266 | 1-20 | 0.944 | 0.809
1276 | 1-48 | 0.982 | 0.862
1292 | 1-19 | 0.979 | 0.933
1296 | 1-21 | 0.984 | 0.944
1297 | 1-19 | 0.984 | 0.953
1332 | 1-38 | 0.942 | 0.653
1358 | 1-18 | 0.947 | 0.595
1371 | 1-33 | 0.957 | 0.789
1380 | 1-26 | 0.979 | 0.904
1397 | 1-27 | 0.962 | 0.777
1399 | 1-23 | 0.937 | 0.960
1404 | 1-24 | 0.998 | 0.977
1410 | 1-15 | 0.946 | 0.845
1414 | 1-24 | 0.913 | 0.588
1415 | 1-19 | 0.982 | 0.929
1416 | 1-12 | 0.931 | 0.891
1418 | 1-30 | 0.933 | 0.5^3
1420 | 1-20 | 0.881 | 0.561
1421 | 1-19 | 0.990 | 0.968
1423 | 1-17 | 0.968 | 0.863
1424 | 1-21 | 0.885 | 0.591
1425 | 1-24 | 0.913 | 0.588
1426 | 1-24 | 0.913 | 0.588
1428 | 1-25 | 0.967 | 0.899
1430 | 1-34 | 0.977 | 0.819
1431 | 1-28 | 0.979 | 0.923
1432 | 1-36 | 0.957 | 0.613
1433 | 1-32 | 0.921 | 0.753
1434 | 1-39 | 0.983 | 0.621
1435 | 1-25 | 0.910 | 0.631
1436 | 1-42 | 0.988 | 0.868
1437 | 1-22 | 0.998 | 0.980
1442 | 1-20 | 0.918 | 0.753
1448 | 1-12 | 0.931 | 0.891
1462 | 1-18 | 0.968 | 0.888
1490 | 1-20 | 0.881 | 0.561
1518 | 1-17 | 0.968 | 0.863
1525 | 1-21 | 0.885 | 0.591
1547 | 1-28 | 0.974 | 0.891
1561 | 1-25 | 0.967 | 0.899
1580 | 1-17 | 0.923 | 0.824
1593 | 1-28 | 0.979 | 0.923
1596 | 1-16 | 0.929 | 0.709
1601 | 1-36 | 0.957 | 0.613
1606 | 1-22 | 0.979 | 0.831
1607 | 1-20 | 0.974 | 0.770
1608 | 1-32 | 0.921 | 0.753
1614 | 1-33 | 0.969 | 0.829
 | 1-20 | 0.959 | 0.869
1625 | 1-39 | 0.983 | 0.621
1632 | 1-25 | 0.910 | 0.631
1636 | 1-33 | 0.897 | 0.591
1639 | 1-42 | 0.988 | 0.868
1645 | 1-20 | 0.927 | 0.566
1647 | 1-17 | 0.923 | 0.742
1648 | 1-22 | 0.998 | 0.980

TABLE 6



SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: Of 


of contig 


NO: 


docket number^ 


NO : in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S. S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 
sequence 




sequence 


priority 
application 




1 


1787 


3573 


5359 


784CIP2__1 


1103 


2 


1788 


3574 


5360 


784CIP2 2 


2673 


3 


1789 


357.5 


5361 


784CIP2_3 


4117 


4 


1790 


3576 


5362 


784CIP2_4 


5556 


5 


1791 


3577 


5363 


784CIP2_5 


5562 


6 


1792 


3578 


5364 


784CIP2_6 


5562 


7 


1793 


3579 


5365 


784CIP2_7 


5562 


8 


1794 


3580 


5366 


784CIP2_8 


5562 


9 


1795 


3581 


5367 


784CIP2_9 


5563 


10 


1796 


3582 


5368 


784CIP2_10 


5564 


11 


1797 


3583 


5369 


784CIP2JL1 


5565 


12 


1798 


3584 


5370 


784CIP2JL2 


5689 


13 


1799 


3585 


I 5371 


784CIP2_13 


5729 


14 


1800 


3586 


5372 


784CIP2_14 


5745 


15 


1801 


3587 


5373 


784CIP2_15 


5777 


16 


1802 


3588 


5374 


784CIP2JL6 


5777 


17 


1803 


3589 


5375 


784CIP2_17 


5789 


18 


1804 


i_ 3590 


5376 


784CIP2_18 


5792 


19 


1805 


3591 


5377 


784CIP2_19 


5804 


20 


1806 


3592 


5378 


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2_21 


5805 


22 


1808 


3594 


5380 


784CIP2_22 


5844 


23 


1809 


3595 


5381 


784CIP2_23 


5844 


24 


1810 


3596 


5382 


784CIP2_24 


5850 


25 


1811 


3597 


5383 


784CIP2J25 


5867 


26 


1812 


3598 


5384 


784CIP2_26 


5973 


27 


1813 


3599 


5385 


784CIP2_27 


5995 


28 


1814 


3600 


5386 


784CIP2_28 


5995 


29 


1815 


3601 


5387 


784CIP2J29 


6005 


30 


1816 


3602 


5388 


784CIP2_30 


6007 


31 


1817" 


3603 


53B9 


784CIP2_31 


6007 


32 


1818 


3604 


5390 


784CIP2_32 


6009 


33 


1819 


3605 


5391 


784CIP2_33 


6012 


34 


1820 


3606 


5392 


784CIP2_34 


6015 


35 


1821 


3607 


5393 


784CIP2_35 


6016 


36 


1822 


3608 


5394 


784CIP2 36 


6016 


37 


1823 


3609 


5395 


784CIP2 37 


6018 


38 


1824 


3610 


5396 


784CIP2_38 


6018 


39 


1825 


3611 


5397 


784CIP2_39 


6018 


40 


1826 


3612 


5398 


7B4CIP2_40 


6023 


41 


1827 


3613 


5399 


784CIP2_41 


60 70 


42 


1828 


3614 


5400 


784CIP2_42 


6081 


43 


1829 


3615 


5401 


784CIP2_43 


6089 


44 


1830 


3616 


5402 


784CIP2_44 


6118 


45 


1831 


3617 


5403 


784CIP2_45 


6118 


46 


1832 


3618 


5404 


784CIP2 46 


6130 


47 


1833 


3619 


5405 


784CIP2 47 


6177 


48 


1834 


3620 


5406 


784CIP2_4B 


6189 


49 


1835 


3621 


5407 


784CIP2 49 


6191 


50 


1836 


3622 


5408 


784CIP2_50 


6204 


51 


1837 


3623 


5409 


784CIP2_51 - 


6204 


52 


1838 


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2__53 


6367 


54 


1840 


3626 


5412 


784CIP2_54 


£436 


55 


1841 


3627 


5413 


784CIP2_55 


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416 


784CIP2_58 


6458 


59 


1845 


3631 


5417 


784CIP2 59 


£458 



271 



WO 01/53312 



PCTAJS00/34263 



cr?o TD NO • 
way xl/ iiv • 

of full- 
length 
nucleotide 


&t!j\2 ID 
NO: of 
full- 
length 


iDtnj ±u inu : 
of r" on t* "5 f3 

nucleotide 
sequence 


SEQ ID 
NO : 

of rnnt" io 

peptide 


Priority 
uuc/te c nuuioer 

i-ui tcijpujiQiny 
SEO ID NO- in 


SEQ ID 
NO: in 

AQ//DO IOC 


sequence 


peptide 
sequence 




sequence 


priority 
application 




60 


1846 


3632 


1 5418 


784CIP2 60 


6462 


61 


1847 


3633 


5419 


784CIP2 61 


6472 


62 


1848 


3634 


5420 


784CIP2 62 


6499 


63 


1849 


3635 


5421 


784CIP2 63 


1 6499 


64 


1850 


3636 


5422 


784CIP2 64 


6505 


65 


1851 


3637 


5423 


784CIP2 65 


6534 " 


66 


1852 


3638 


5424 


784CIP2 66 


6534 


67 


IB 53 


3639 ™ 


S42S 


784CIP2 67 


6540 


68 


1B54 


3640 


5426 


784CIP2 68 


6550 


69 


1B55 


3641 


5427 


784CIP2 69 


6550 


70 


1856 


3642 


5428 


784CIP2 70 


6592 


71 


1857 


3643 


5429 


784CIP2 71 


6645 


72 


1858 


3644 


5430 


784CIP2 72 


G71 
DO (J, 


73 


1859 


3645 


5431 


784CIP2 1~\ 


6763 


74 


1860 


3646 


5432 


784CTP2 74 


6763 


75 


1861 


3647 


5433 


784C1P2 75 


678 6 1 


76 


1862 


3648 


5434 


784CIP2 7fi 


D Oil 


77 


1863 


3649 


5435 


784CIP2 77 


683 0 


78 


1864 


3650 


5436 


7RdTTP? 7ft 


con 
bo j J. 


79 


1865 


3651 


5437 


784CTP2 7q 


C R7 2 


80 


1866 


3652 


5438 


784CIP2 an 


6834 


81 


1867 


3653 


543 9 


7R4CTP2 fll 




82 


1868 


3654 


544 0 


784CTP2 R2 


6835 


83 


1869 


3655 


5441 


784CIP2 83 


6837 


84 


1870 


| 3656 


5442 


784CIP2 84 


6843 


85 


1871 


i 3657 


5443 


784CTP2 8*5 




86 


1872 


3658 


5444 


784CIP2 8fi 


6915 


87 


1873 


3659 


5445 


784CTP2 H7 


6932 


88 


1874 


3660 


5446 


784CTP2 RR 


OJD / 


89 


1875 


3661 


5447 


784CIP2 89 


6961 


90 


1876 


3662 


5448 


784CIP2 90 


6973 


91 


1877 


3663 


5449 


7B4CTP2 91 


6973 


92 


1878 


3664 


5450 


784CiP2 95 


7007 


93 


1879 


3665 


5451 


7B4CTP2 94 


rvlQ 


94 


1880 


3666 


5452 


7R4CTP2 9^ 




95 


1881 


3667 


5453 


7R4CTP2 9fi 


"702 0 


96 


1882 


3668 


5454 


784CIP2" "97 


702 ft 


97 


1883 


3669 


5455" " 


7B4CfP2~~9fl 




98 


1884 


3670 


5456 


784CTP2 99 


702T 


99 


1885 


3671 


5457 


7R4CTP2 1 flD 




100 


1886 


3672. 


5458 


7B4CTP2 1 m 
/uiv»ir6 lux 


702 R 


101 


1887 


3673 


5459 


784CIP2 102 


702 9 


102 


1888 


3674 


5460 


784CIP2 103 


70^1 


103 


1889 


3675 


5461 


784CTP2 104 




104 


1890 


3676 


5462 


784CIP2 105 


■7033 


105 


1891 


3677 


54^3 


784CIP2 106 


7035 


106 


1892 


3<^8" " 


5464 


784CIP2 107 


7036 


107 


1893 


3679 


5465 


784CIP2 108 


7039 


108 


1894 


3680 


5466 


784CIP2 109 


7043 


109 


1895 


3681 


5467 


784CIP2 110 


7044 


110 


1896 


3682 


5468 


784CIP2 111 


7046 


111 


1897 


3683 


5469 


784CIP2 112 


7054 


112 


1898 


3684 


5470 


784CIP2 113 


7061 


113 


1899 


3685 


5471 


784CIP2_114 


7077 


114 


1900 


3686 


5472 


784CIP2 115 


7092 


115 


1901 


3687 


5473 


784CIP2 116 


7094 


116 


1902 


3688 


5474 


784CIP2_117 


7106 


117 


1903 


3689 


5475 


784CIP2 118 


7107 


118 


1904 


3690 


5476 


784CIP2 119 


7111 


119 


1905 


3691 


5477 " 


784CIP2_120 


7123 


120 


1906 


3692 


5478 


784CIP2 121 


7142 


121 


1907 


3693 


5479 


784CIP2 122 


7142" 
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SEQ ID NO: 


SEQ ID 






Priority 


SEQ ID 


of full- 


NO: of 


of con tig 


NO : 




ran • -) n 


length 


full- 


nucleotide 


of conticj" 


cor respon ding 


U. S . S . N . 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488 , 725 


sequence 


peptide 
sequence 




sequence 


priority 
application 




122 


1908 


3694 


5480 


784CIP2JL23 


7154 


123 


1909 


3695 


5481 


784CIP2_124 


7160 


124 


1910 


3696 


5482 


784CIP2 125 


7169 


125 


1911 


3697 


5483 


784CIP2JL26 


7185 


126 


1912 


3698 


5484 


784CIP2 127 


7197 


127 


1913 


3699 


5485 


784CIP2_128 


7219 


128 


1914 


"3700 


5486 


784CIP2_129 


7226 


129 


1915 


3701 


5487 


784CIP2__130 


7229 


130 


1916 


3702 


5488 


784CIP2_131 


7234 


131 


1917 


3703 


5489 


784CIP2_132 


7235 


132 


1918 


3704 


5490 


784CIP2 133 


7235 


133 


1919 


3705 


5491 


784CIP2 134 


7238 " 


134 


1920 


3706 


5492 


784CIP2_135 


7247 


135 


1921 


3707 


5493 


784CIP2_136 


7261 


136 


1922 


3708 


5494 


784CIP2_137 


7262 


137 


1923 


3709 


5495 


784CIP2 138 


7267 


138 


1924 


3710 


5496 


784CIP2 139 


7272 


139 


1925 


3711 


5497 


784CIP2 140 


7273 


140 


1926 


3712 


5498


784CIP2 141 


7282 


141 


1927 


3713 


5499 


784CIP2 142 


7288 


142 


1928 


3714 


5500 


784CIP2 143 


7291 


143 


1929 


3715 


5501 


784CIP2 144 


7293 


144 


1930 


3716 


5502 


784CIP2 145 


7294 


145 


1931 


3717 


5503 


784CIP2 146 


7299 


146 


1932 


3718 


5504 


784CIP2 147 


7300 


147 


1933 


3719 


5505 


784CIP2 148 


7312 


148 


1934 


3720 


5506 


784CIP2 149 


7313 


149 


1935 


3721 


5507 


784CIP2_150 


7315 


150 


1936 


3722 


5508 


784CIP2_151 


7318 


151 


1937 


3723 


5509 


784CIP2_152 


7321 


152 


1938 


3724 


5510 


784CIP2_153 


7330 


153 


1939 


3725 


5511 


784CIP2_154 


7331 


154 


1940 


3726 


5512 


784CIP2_155 


7333 


155 


1941 


3727 


5513 


784CIP2_156


7350 


156 


1942 


3728 


5514 


784CIP2_157 


7352 


157 


1943 


3729 


5515 


784CIP2_158 


7384 


158 


1944 


3730 


5516 


784CIP2_159 


7403


159 


1945 


3731 


5517 


784CIP2 160 


7431 


160 


1946 


3732 


5518 


784CIP2_161 


7441 


161


1947 


3733 


5519 


784CIP2_162


7453 


162


1948 


3734 


5520 


784CIP2 163 


7467 


163 


1949


3735 


5521 


784CIP2 164 


7471 


164 


1950 


3736


5522 


784CIP2_165


7493 


165


1951 


3737 


5523 


784CIP2_166 


7502 


166


1952 


3738 


5524 


784CIP2 167 


7511 


167 


1953 


3739 


5525 


784CIP2 168 


7514 


168 


1954 


3740 


5526 


784CIP2 169 


7556


169 


1955 


3741 


5527 


784CIP2_170 


7541 


170 


1956 


3742 


5528 


784CIP2 171 


7570 


171 


1957 


3743 


5529 


784CIP2 172 


7578 


172 


1958 


3744 


5530 


784CIP2 173 


7583 


173 


1959 


3745 


5531 


784CIP2 174 


7592 


174 


1960 


3746 


5532 


784CIP2_175 


7601 


175 


1961 


3747 


5533 


784CIP2 176 


7602 


176


1962 


3748 


5534 


784CIP2_177


7608 


177 


1963 


3749 


5535 


784CIP2 178 


7615 


178 


1964 


3750 


5536 


784CIP2 179 


7617 


179 


1965 


3751 


5537 


784CIP2 181 


7624 


180 


1966 


3752 


5538 


784CIP2_182


7£2£ 


181 


1967 


3753


5539 


784CIP2 183 


7640 


182 


1968 


3754 


5540 


784CIP2_184 


7641 


183 


1969 


3755 


5541 


784CIP2 185 


7641


184 


1970 


3756 


5542


784CIP2 186 


7641 


185 


1971 


3757 


5543 


784CIP2 187 


7642 


186 


1972 


3758 


5544 


784CIP2 188 


7649 


187 


1973 


3759 


5545 


784CIP2_189


7656 


188 


1974 


3760 


5546


784CIP2 190 


7657 


189 


1975 


3761 


5547 


784CIP2 191 


7657 


190 


1976 


3762 


5548


784CIP2_192


7662 


191 


1977 


3763


5549 


784CIP2 193 


7668 


192 


1978 


3764 


5550 


784CIP2_194


7673 


193 


1979


3765


5551


784CIP2_195


7fi QO 


194


1980 


3766 


5552 




7700 


195 


1981 


3767


5553


784CIP2_197


770 Q 


196 


1982 


3768 


5554 


784CIP2_198


7736 


197 


1983 


3769 


5555 


784CIP2_199


7737 


198 


1984 


3770 


5556 


784CIP2_200


7744


199 


1985 


3771 


5557 


784CIP2_201


7771 


200 


1986 


3772 


5558 


784CIP2_202


7786 


201 


1987


3773


5559 


784CIP2_203


7791 


202 


1988 


3774 


5560 


784CIP2_204


7797


203 


1989 


3775 


5561 


784CIP2_205


7806 


204 


1990 


3776 


5562 


784CIP2_206


7812


205 


1991


3777


5563


784CIP2_207


7817


206 


1992 


3778 


5564 




7818


207 


1993


3779


5565


784CIP2_209


7 89 7 


208 


1994


3780


5566


7 QAPTDO 9 1 A 


7 ft 9 7 

IBB / 


209


1995


3781


5567


/D^LlrZ .411 


/OJ U 


210


1996


3782


5568


7ft/ir"TD9 919 
/ 0 ^ ^1 fi^Z 1 4 


•7 fl 1 C 


211


1997


3783


5569


784CIP2_214


7fl a n 




212


1998


3784


5570


784CIP2_215


7ft C Q 
/ O 3 O 


213 


1999 


3785


5571


784CIP2_216


7858 


214 


2000


3786


5572


784CIP2_217


7861




215


2001


3787


5573


784CIP2_218


/ODD 


216 


2002


3788 


5574 


784CIP2_219


7868 


217 


2003


3789 


5575 


784CIP2_220


7896 


218 


2004 


3790 


5576 


784CIP2_221


7898


219 


2005


3791 


5577 


784CIP2_222


7900 


220 


2006 


3792


5578


784CIP2_223


»3UD 


221 


2007


3793


5579 


784CIP2_224


7 QO R 
/ 3VO 


222 


2008


3794


5580 


784CIP2_225


7QO Q 
/ 3u3 


223 


2009 


3795 


5581 


784CIP2_226


7917 


224 


2010 


3796 


5582 


784CIP2_227


7932


225 


2011 


3797


5583


784CIP2_228


7940 


226 


2012 


3798 


5584 


784CIP2_229


7940


227 


2013 


3799


5585


784CIP2_230


7984 


228 


2014 


3800


5586


784CIP2_231


7984


229 


2015 


3801


5587


784CIP2_232


8001 


230 


2016


3802


5588


784CIP2 233 


8021 


231 


2017 


3803 


5589 


784CIP2 234 


8029 


232 


2018 


3804


5590


784CIP2_235


8033 


233 


2019 


3805 


5591 


784CIP2_236


8040 


234 


2020


3806


5592


784CIP2_237


8052 


235 


2021 


3807 


5593 


784CIP2_238


8096 


236 


2022 


3808 


5594 


784CIP2 239 


8096 


237 


2023 


3809 


5595 


784CIP2_240 


8113 


238 


2024 


3810 


5596 


784CIP2_241 


8126 


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2_243 


8137 


241 


2027 


3813 


5599 


784CIP2_244 


8137 


242 


2028 


3814 


5600 


784CIP2_245 


8139 


243 


2029 


3815 


5601 


784CIP2 246 


8159 


244 


2030 


3816 


5602 


784CIP2_247


8161 


245 


2031 


3817 


5603 


784CIP2_248 


8176


246 


2032 


3818 


5604 


784CIP2 249 


8196 


247 


2033 


3819 


5605 


784CIP2 250 


8200 


248 


2034 


3820 


5606 


784CIP2 251 


8212 


249 


2035 


3821 


5607 


784CIP2 252 


8220 


250 


2036 


3822 


5608 


784CIP2_253


8238


251 


2037 


3823 


5609 


784CIP2 254 


8254 


252 


2038 


3824 


5610 


784CIP2 255 


8255 


253 


2039 


3825 


5611 


784CIP2 256 


8288 


254 


2040 


3826 


5612 


784CIP2 257 


8296 


255 


2041 


3827 


5613 


784CIP2 258 


8329 


256 


2042 


3828 


5614


784CIP2 259 


8362 


257 


2043 


3829 


5615 


784CIP2 260 


8429 


258 


2044 


3830 


5616 


784CIP2 261 


8436 


259 


2045 


3831 


5617 


784CIP2_262


8448


260 


2046 


3832 


5618 


784CIP2_263


8472 


261 


2047 


3833 


5619 


784CIP2 264 


8502 


262 


2048 


3834 


5620 


784CIP2_265


fl 5fl4 


263 


2049 


3835 


5621 


784CIP2_266


09U / 


264 


2050 


3836 


5622 


784CIP2_268




265 


2051 


3837 


5623 


784CIP2_269


8515 


266 


2052 


3838 


5624 


784CIP2_270


8519 


267 


2053 


3839 


5625 


784CIP2_271


8530


268 


2054 


3840 


5626 


784CIP2 272 


8532


269 


2055 


3841 


5627 


784CIP2 273 


8532


270 


2056 


3842 


5628 


784CIP2_274


8539 


271 


2057 


3843 


5629 


784CIP2 275 


8541 


272 


2058 


3844


5630 


784CIP2 276 


8543 


273 


2059 


3845 


5631 


784CIP2 277 


8593 


274 


2060 


3846 


5632 


784CIP2 278 


8595 


275 


2061 


3847 


5633 


784CIP2 279 


8615 


276 


2062 


3848 


5634 


784CIP2 280 


8620 


277 


2063 


3849 


5635 


784CIP2 281 


8621 


278 


2064 


3850 


5636


784CIP2 282 


8623 


279 


2065 


3851 


5637


784CIP2_283


8625 


280 


2066 


3852 


5638 


784CIP2 284 


8628 


281 


2067 


3853 


5639 


784CIP2_285


8628


282 


2068 


3854


5640 


784CIP2_286


8629 


283 


2069 


3855


5641 


784CIP2_287


ODJV 


284 


2070 


3856 


5642 


784CIP2_288


8631


285


2071 


3857 


5643 


784CIP2_289


8633 


286 


2072 


3858 


5644 


784CIP2 290 


8634 


287 


2073 


3859 


5645 


784CIP2 291 


8635 


288 


2074 


3860 


5646 


784CIP2_292


8636 


289 


2075 


3861 


5647


784CIP2_293


8659 


290 


2076 


3862 


5648 


784CIP2_294


8£S~0 


291 


2077 


3863 


5649


784CIP2_295


8667 


292 


2078 


3864 


5650 


784CIP2 296 


8667 


293 


2079 


3865 


5651 


784CIP2 297 


8685 


294 


2080 


3866 


5652 


784CIP2 298 


8805 


295 


2081 


3867 


5653 


784CIP2 299 


8896 


296 


2082


3868 


5654 


784CIP2 300 


8976 


297 


2083 


3869 


5655 


784CIP2 301 


9046 


298 


2084 


3870 


5656 


784CIP2_302 


9048 


299 


2085 


3871 


5657 


784CIP2 303 


9116 


300 


2086 


3872 


5658 


784CIP2 304 


9195 


301 


2087 


3873 


5659 


784CIP2_305 


9201 


302 


2088 


3874 


5660 


784CIP2 306 


9307 


303 


2089 


3875 


5661 


784CIP2_307 


9321 


304 


2090


3876 


5662 


784CIP2_308 


9397 


305


2091 


3877 


5663 


784CIP2_309 


9405 


306 


2092 


3878 


5664 


784CIP2 310 


9406 


307 


2093 


3879 


5665


784CIP2 311 


9422


308 


2094 


3880 


5666 


784CIP2_312




309 


2095 


3881 


5667 


784CIP2 313 


9512 


310 


2096 


3882 


5668 


784CIP2 314 


9632 


311 


2097 


3883 


5669 


784CIP2 315 


9661 


312 


2098 


3884 


5670 


784CIP2 316 


9664 


313 


2099 


3885 


5671 


784CIP2 317 


9691


314 


2100 


3886 


5672 


784CIP2_318


9700 


315 


2101 


3887 


5673 


784CIP2 319 


9716 


316 


2102 


3888


5674


784CIP2 320 


9721 


317 


2103 


3889 


5675 


784CIP2_321


9870 


318 


2104 


3890 


5676 


784CIP2 322 


9887 


319 


2105 


3891 


5677 


784CIP2_323


9923 


320 


2106 


3892 


5678 


784CIP2_324


J» J J O 


321 


2107 


3893 


5679 


784CIP2_325




322 


2108 


3894 


5680 


784CIP2_326


10007 


323 


2109 


3895 


5681 


784CIP2_327


i nnno 

X UUUJ 


324 


2110 


3896 


5682 


784CIP2_328


X U U*±b 


325 


2111 


3897 


5683 


/OlV^XtrZ 


XU XDO 


326 


2112 


3898 


5684 




7 one 


327 


2113 


3899 


5685 


f 0*iV.Xf J O X 


Ivfi OJ 


328 


2114 


3900 


5686 


784CIP2B_1


XDZ 


329 


2115 


3901 


5687 




1 C7 
XO / 


330 


2116 


3902 


5688 


784CIP2B_3


z us 


331 


2117 


3903 


5689 


784CIP2B_4


0 7 n 
Z XU 


332 


2118 


3904 


5690 


784CIP2B_5




333 


2119 


3905 


5691 


784CIP2B_6


•J7C 
A CO 


334 


2120 


3906 


5692 




OKA 


335 


2121 


3907 


5693 


784CIP2B_8


Zoo 


336


2122 


3908 


5694 




2 93 


337 


2123 


3909


5695




2 93 


338 


2124 


3910 


5696 


784CIP2B_11


•5 

i ^ 


339 


2125 


3911 


5697 




J Uz 


340 


2126 


3912 


5698 


784CIP2B_13


J XX 


341 


2127 


3913 


5699 


784CIP2B_14




342 


2128 


3914 


5700 


784CIP2B_15


■3CQ 
J JO 


343 


2129 


3915 


5701 




•JCQ i 
JDO 


344 


2130


3916 


5702 


784CIP2B_17


j y j 


345 


2131 


3917 


5703 


784CIP2B_18


477 


346 


2132 


3918 


5704 


784CIP2B_19


DUO 


347 


2133 


3919 


5705 


784CIP2B_20


508 


348 


2134 


3920 


5706 


784CIP2B_21


515


349 


2135 


3921 


5707 


784CIP2B_22




350 


2136 


3922 


5708 


784CIP2B_23


588 


351 


2137 


3923 


5709 


784CIP2B_24


D J X 


352 


2138 


3924 


5710 


784CIP2B_25


DSD 




353


2139


3925 


5711 


784CIP2B_26


^ Q4 


354 


2140 


3926 


5712 


784CIP2B_27


OX? 


355 


2141 


3927 


5713 


784CIP2B_28


C7A 


356 


2142 


3928 


5714 


784CIP2B_29


654 


357 


2143 


3929 


5715 


784CIP2B_30


692 


358 


2144 


3930 


5716


784CIP2B_31


TCI 
» JJ 


359


2145


3931


5717




*7 CO 
/DO 


360 


2146 


3932 


5718 


784CIP2B 33 


787 


361 


2147 


3933 


5719 


784CIP2B_34 


833 


362 


2148 


3934 


5720


784CIP2B_35


838 


363 


2149 


3935 


5721


784CIP2B_36


870 


364 


2150 


3936 


5722 


784CIP2B_37


891


365 


2151 


3937 


5723 


784CIP2B_38 


891 


366 


2152 


3938 


5724 


784CIP2B_39


921 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B_41


932 


369 


2155 


3941 


5727 


784CIP2B_42 


942


370 


2156 


3942 


5728 


784CIP2B_43 


958 


371


2157 


3943 


5729 


784CIP2B_44 


968 


372 


2158 


3944 


5730 


784CIP2B_45 


992 


373 


2159 


3945 


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B 47 


1074 


375 


2161 


3947 


5733 


784CIP2B 48 


1104 


376 


2162 


3948 


5734 


784CIP2B 49 


1114 


377 


2163 


3949 


5735 


784CIP2B 50 


1144 


378 


2164 


3950 


5736 


784CIP2B 51 


1262 


379 


2165 


3951 


5737 


784CIP2B 52 


1318 


380 


2166 


3952 


5738 


784CIP2B 53 


1319 


381 


2167 


3953


5739


784CIP2B 54 


1328


382 


2168 


3954 


5740 


784CIP2B 55 


1436 


383 


2169 


3955 


5741 


784CIP2B 56 


1464 


384 


2170 


3956 


5742 


784CIP2B 57 


1584 


385 


2171 


3957 


5743


784CIP2B_58


1617 


386


2172 


3958 


5744 


784CIP2B 59 


1724 


387 


2173 


3959 


5745 


784CIP2B 60 


1728 


388 


2174 


3960 


5746 


784CIP2B 61 


1772 


389 


2175 


3961 


5747 


784CIP2B 62 


1809 


390 


2176 


3962 


5748 


784CIP2B 63 


1868 


391 


2177 


3963 


5749 


784CIP2B 64 


1898 


392 


2178 


3964 


5750 


784CIP2B 65 


1926 


393


2179 


3965 


5751 


784CIP2B 66 


1965


394 


2180


3966 


5752 


784CIP2B 67 


1967 


395 


2181 


3967 


5753 


784CIP2B 68 


1995 


396 


2182 


3968 


5754 


784CIP2B 69 


2005


397 


2183 


3969 


5755 


784CIP2B 70 


2027 


398 


2184 


3970 


5756 


784CIP2B 71 


2055 


399


2185 


3971 


5757 


784CIP2B 72 


2103 


400 


2186 


3972 


5758 


784CIP2B 73 


2106 


401 


2187 


3973 


5759 


784CIP2B 74 


2166 


402 


2188 


3974 


5760 


784CIP2B 75 


2175 


403 


2189 


3975 


5761 


784CIP2B 76 


2176 


404 


2190 


3976 


5762 


784CIP2B 78 


2236


405 


2191 


3977 


5763 


784CIP2B 79 


2250 


406 


2192 


3978 


5764 


784CIP2B 80 


2300 


407 


2193 


3979 


5765


784CIP2B 81 


2323 


408 


2194 


3980 


5766 


784CIP2B 82 


2340 


409 


2195 


3981 


5767 


784CIP2B 83 


2371 


410 


2196 


3982 


5768 


784CIP2B 84 


2399 


411 


2197 


3983 


5769 


784CIP2B_85 


2411 


412 


2198 


3984 


5770 


784CIP2B 86 


2428 


413 


2199 


3985 


5771 


784CIP2B 87 


2430 


414 


2200 


3986


5772 


784CIP2B 88 


2439 


415 


2201 


3987 


5773 


784CIP2B 89 


2447 


416 


2202 


3988 


5774 


784CIP2B 90 


2461 


417 


2203 


3989 


5775 


784CIP2B 91 


2487 


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205 


3991 


5777 


784CIP2B 93 


2512 


420 


2206 


3992 


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208 


3994 


5780 


784CIP2B 96 


2816 


423


2209 


3995 


5781 


784CIP2B_97 


2818 


424 


2210 


3996 


5782 


784CIP2B_98 


2819 


425 


2211 


3997 


5783 


784CIP2B_99 


2943 


426 


2212 


3998 


5784 


784CIP2B_100


3137 


427 


2213 


3999 


5785 


784CIP2B_101


3137 


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B_103 


3323 


430 


2216 


4002 


5788 


784CIP2B 104 


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3352


432 


2218 


4004 


5790 


784CIP2B 106 


3417 


433 


2219 


4005 


5791 


784CIP2B_107 


3418 


434 


2220 


4006 


5792 


784CIP2B_108 


3442 


435 


2221 


4007 


5793 


784CIP2B 109 


3442 


436 


2222 


4008 


5794 


784CIP2B 110 


3444 


437 


2223 


4009 


5795 


784CIP2B_111 


3855 


438 


2224 


4010 


5796 


784CIP2B_112 


3863 


439 


2225 


4011 


5797 


784CIP2B 113 


4090 


440 


2226 


4012 


5798 


784CIP2B 114 


4105 


441 


2227


4013 


5799 


784CIP2B 115 


4142 


442 


2228 


4014 


5800 


784CIP2B_116 


4142 


443 


2229 


4015 


5801 


784CIP2B_117 


4149 


444 


2230 


4016 


5802 


784CIP2B 118 


4196 


445 


2231 


4017 


5803 


784CIP2B 119 


4202 


446 


2232 


4018 


5804 


784CIP2B_120 


4274 


447 


2233 


4019 


5805 


784CIP2B 121 


4304 


448 


2234 


4020 


5806 


784CIP2B 122 


4306 


449 


2235 


4021 


5807 


784CIP2B 123 


4311 


450 


2236 


4022 


5808 


784CIP2B 124 


4321 


451 


2237 


4023 


5809 


784CIP2B 125 


4323 


452 


2238 


4024 


5810 


784CIP2B 126 


4332 


453 


2239 


4025 


5811 


784CIP2B 127 


4488


454 


2240 


4026 


5812 


784CIP2B 128 


4588 


455 


2241 


4027 


5813 


784CIP2B 129 


5569 


456 


2242 


4028 


5814 


784CIP2B 130 


5573 


457 


2243 


4029 


5815 


784CIP2B 131 


5577 


458 


2244 


4030 


5816 


784CIP2B 132 


5579 


459 


2245 


4031 


5817 


784CIP2B 133 


5582 


460 


2246 


4032 


5818 


784CIP2B 134 


5583 


461 


2247 


4033 


5819 


784CIP2B 135 


5584 


462


2248 


4034 


5820 


784CIP2B 136 


5585 


463 


2249 


4035 


5821 


784CIP2B 137 


5591 


464 


2250 


4036 


5822 


784CIP2B_138 


5593 


465 


2251 


4037 


5823 


784CIP2B_139


5594 


466 


2252 


4038 


5824 


784CIP2B_140 


5594 


467 


2253 


4039 


5825 


784CIP2B_141 


5598 


468 


2254 


4040 


5826


784CIP2B_142 


5602 


469 


2255 


4041 


5827 


784CIP2B 143 


5605 


470 


2256 


4042 


5828 


784CIP2B 144 


5608 


471 


2257 


4043 


5829 


784CIP2B 145 


5617 


472 


2258 


4044 


5830 


784CIP2B_146 


5620 


473 


2259 


4045 


5831 


784CIP2B_147 


5622 


474 


2260 


4046 


5832 


784CIP2B 148 


5623 


475 


2261 


4047 


5833 


784CIP2B_149 


5624 


476 


2262 


4048 


5834 


784CIP2B_150 


5625 


477 


2263 


4049 


5835 


784CIP2B 151 


5627 


478 


2264 


4050 


5836 


784CIP2B 152 


5628


479 


2265


4051 


5837 


784CIP2B 153 


5630 


480 


2266 


4052 


5838 


784CIP2B 154 


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268 


4054 


5840 


784CIP2B 156 


5641 


483 


2269 


4055 


5841 


784CIP2B 157 


5643 


484 


2270 


4056 


5842 


784CIP2B_158 


5647 


485 


2271 


4057 


5843 


784CIP2B_159 


5649


486 


2272 


4058 


5844 


784CIP2B_160 


5658 


487 


2273 


4059 


5845 


784CIP2B_161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162 


5667 


489 


2275 


4061 


5847 


784CIP2B_163 


5672 


490 


2276 


4062 


5848 


784CIP2B 164 


5674 


491


2277 


4063 


5849 


784CIP2B 165 


5678 


492 


2278 


4064 


5850 


784CIP2B 166 


5680 


493 


2279 


4065 


5851 


784CIP2B 167 


5684


494 


2280 


4066 


5852 


784CIP2B_168


5686 


495 


2281 


4067 


5853 


784CIP2B_169


5694 


496 


2282 


4068 


5854 


784CIP2B_170


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498 


2284


4070 


5856 


784CIP2B_172


5712 


499 


2285 


4071 


5857 


784CIP2B_173 


5719 


500 


2286 


4072 


5858 


784CIP2B_174 


5720 


501 


2287 


4073 


5859 


784CIP2B_175


5727 


502 


2288 


4074 


5860 


784CIP2B_176 


5730 


503 


2289 


4075 


5861 


784CIP2B 177 


5734 


504 


2290 


4076 


5862 


784CIP2B 178 


5738 


505 


2291 


4077 


5863 


784CIP2B 179 


5739 


506 


2292 


4078 


5864 


784CIP2B_180 


5740 


507 


2293 


4079 


5865 


784CIP2B_181 


5744 


508 


2294 


4080 


5866 


784CIP2B 182 


5748 


509 


2295 


4081 


5867 


784CIP2B_183 


5749 


510 


2296 


4082 


5868 


784CIP2B_184 


5750 


511 


2297 


4083 


5869 


784CIP2B 185 


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B_187


5761 


514 


2300 


4086 


5872 


784CIP2B 188 


5762 


515 


2301 


4087 


5873 


784CIP2B 189 


5767 


516 


2302 


4088 


5874 


784CIP2B 190 


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2B_192


5784 


519 


2305 


4091 


5877 


784CIP2B_193


5788 


520 


2306 


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B 196 


5807 


522 


2308 


4094 


5880 


784CIP2B 197 


5818 


523 


2309 


4095 


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882 


784CIP2B_199


5827 


525 


2311 


4097 


5883 


784CIP2B 200 


5828 


526 


2312 


4098 


5884 


784CIP2B_201


5842 


527 


2313 


4099 


5885 


784CIP2B 202 


5853 


528 


2314 


4100 


5886 


784CIP2B_203


5861 


529 


2315 


4101 


5887 


784CIP2B 204 


5864 


530 


2316 


4102 


5888 


784CIP2B 205 


5865 


531 


2317 


4103 


5889 


784CIP2B 206 


5871 


532 


2318 


4104 


5890 


784CIP2B 207 


5873 


533 


2319 


4105 


5891 


784CIP2B 208 


5873 


534 


2320 


4106 


5892 


784CIP2B 209 


5875 


535 


2321 


4107 


5893 


784CIP2B 210 


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537 


2323 


4109 


5895 


784CIP2B_212 


5880 


538 


2324 


4110 


5896 


784CIP2B 213 


5880 


539 


2325 


4111 


5897 


784CIP2B_214


5880 


540 


2326 


4112 


5898 


784CIP2B 215 


5880 


541 


2327 


4113 


5899 


784CIP2B_216 


5885 


542 


2328 


4114 


5900 


784CIP2B_217 


5895 


543 


2329 


4115 


5901


784CIP2B 218 


5898 


544 


2330 


4116


5902 


784CIP2B 219 


5902 


545 


2331 


4117 


5903 


784CIP2B 220 


5904 


546 


2332 


4118 


5904 


784CIP2B_221 


5918 


547 


2333 


4119 


5905 


784CIP2B_222 


5921 


548 


2334 


4120 


5906 


784CIP2B_223


5927 


549 


2335 


4121 


5907 


784CIP2B_224 


5932 


550 


2336 


4122 


5908 


784CIP2B 225 


5939 


551 


2337 


4123


5909 


784CIP2B_226 


5945 


552 


2338 


4124 


5910 


784CIP2B_227 


5946 


553 


2339 


4125 


5911 


784CIP2B_228


5947 


554 


2340 


4126 


5912 


784CIP2B_229


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967 






WO 01/53312 PCT/US00/34263 





556


2342


4128


5914


TQAPTDOU 111 


5975 


557


2343


4129


5915


*7Q/1f'TTnO too 
/ U4d Jt2B <jj 


5977 


558


2344


4130


5916


784CIP2B_234


5978 




559


2345


4131


5917


784CIP2B_235


5979


560


2346


4132


5918


784CIP2B_236


5980 


561


2347


4133


5919


784CIP2B_237


5988 


562


2348


4134


5920


784CIP2B_238


5989 


563


2349


4135


5921




5991 


564


2350


4136


5922




5997 


565


2351 


4137 


5923 


784CIP2B_241


5998 


566


2352


4138


5924


784CIP2B_242


6003 


567


2353


4139


5925


784CIP2B_243


6004 


568


2354


4140


5926 




6013 


569 


2355 


4141


5927


784CIP2B_245


6028 


570


2356


4142


5928


784CIP2B_246


6028 


571


2357


4143


5929


784CIP2B_247


6029 


572 


2358 


4144


5930


784CIP2B_248


6031 


573 


2359 


4145 


5931 


784CIP2B_249


6031


574 


2360 


4146 


5932 


784CIP2B_250


6032 


575 


2361 


4147 


5933 


784CIP2B_251 


6037


576 


2362 


4148 


5934 


784CIP2B_252 


6037 


577 


2363 


4149 


5935 


784CIP2B_253 


6043 


578 


2364


4150


5936


784CIP2B_254


6044 


579 


2365


4151


5937


784CIP2B_255


6046 


580 


2366 


4152 


5938 


784CIP2B_256 


6048 


581 


2367 


4153 


5939 


784CIP2B_257 


6049


582 


2368 


4154 


5940 


784CIP2B_258


6051 


583 


2369 


4155 


5941 


784CIP2B_259


6053 


584 


2370 


4156 


5942 


784CIP2B_260 


6060 


585 


2371 


4157 


5943 


784CIP2B_261 


6063 


586 


2372 


4158 


5944 


784CIP2B 262 


6066 


587 


2373


4159 


5945 


784CIP2B_263 


6067 


588 


2374


4160


5946


784CIP2B_264


6068 


589 


2375 


4161 


5947 


784CIP2B_265


6073 


590 


2376


4162 


5948 


784CIP2B_266 


6076 


591 


2377


4163 


5949 


784CIP2B_267 


6076 


592 


2378 


4164 


5950 


784CIP2B_268


6077 


593 


2379 


4165 


5951 


784CIP2B_269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270


6082 


595 


2381


4167


5953


784CIP2B_272


C C\ Q Q 


596 


2382 


4168 


5954 


784CIP2B_273


'"en at 


597 


2383 


4169


5955


784CIP2B_274


6094 


598 


2384 


4170 


5956 


784CIP2B_275


6101 


599 


2385 


4171 


5957


784CIP2B_276


"' " c i ni 

b 1UJ 


600


2386


4172


5958


784CIP2B_277


6104 


601


2387


4173


5959


784CIP2B_278


ci no 
bJLUo 


602


2388


4174


5960 


784CIP2B_279 


6112 


603


2389


4175


5961


784CIP2B_280


6121


604


2390


4176


5962


784CIP2B_281


6126


605


2391


4177


5963


784CIP2B_282


6126


606


2392


4178 


5964 


784CIP2B_283 


6128 


607 


2393


4179 


5965 


784CIP2B_284 


6129


608 


2394 


4180 


5966 


784CIP2B_285


6133 


609 


2395 


4181 


5967 


784CIP2B_286 


6133 


610 


2396 


4182 


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B_288


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B_290 


6145 


614 


2400 


4186 


5972 


784CIP2B_291 


6146 


615 


2401 


4187 


5973 


784CIP2B_292


6146 


616 


2402 


4188 


5974 


784CIP2B 293 


6149 


617 


2403


4189 


5975 


784CIP2B 294 


6149


618 


2404 


4190 


5976 


784CIP2B 295 


6153 


619 


2405 


4191 


5977 


784CIP2B 296 


6159 


620 


2406 


4192 


5978 


784CIP2B 297 


6164 


621 


2407 


4193 


5979 


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299


6172 


623


2409 


4195 


5981 


784CIP2B 300 


6173 


624 


2410 


4196 


5982 


784CIP2B 301 


6190


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198 


5984 


784CIP2B_303


6196 


627 


2413 


4199 


5985 


784CIP2B 304 


6197 


628 


2414 


4200 


5986 


784CIP2B 305 


$196 " 


629 


2415 


4201 


5987 


784CIP2B 306 


6198 


630


2416 


4202 


5988 


784CIP2B 308 


6214 


631 


2417 


4203 


5989 


784CIP2B_309


6215 


632 


2418 


4204 


5990 


784CIP2B_310


6219


633 


2419 


4205 


5991 


784CIP2B_311


6226 


634


2420 


4206 


5992 


784CIP2B_312


6229 


635 


2421 


4207 


5993 


784CIP2B_313




636 


2422 


4208 


5994 


784CIP2B_314


6237


637 


2423 


4209 


5995 


784CIP2B_315




638 


2424 


4210 


5996 


784CIP2B_316




639 


2425 


4211 


5997 


784CIP2B_317




640 


2426 


4212 


5998 


784CIP2B_318


6239


641 


2427 


4213 


5999 


784CIP2B_319


6240 


642 


2428 


4214 


6000 




6244


643 


2429 


4215 


6001 


784CIP2B_321


6245 


644


2430 


4216 


6002 


784CIP2B_322


6250 


645 


2431 


4217 


6003 


784CIP2B 323 


6252 


646 


2432 


4218 


6004 


784CIP2B_324


6252 


647 


2433 


4219 


6005 


784CIP2B 325 


6256 


648 


2434 


4220 


6006 


784CIP2B 326 


6260 


649 


2435 


4221 


6007 


784CIP2B_327 


6261 


650 


2436 


4222 


6008 


784CIP2B_328 


6264 


651 


2437 


4223 


6009 


784CIP2B 329 


6265 


652 


2438 


4224 


6010 


784CIP2B 330 


6266 


653 


2439 


4225 


6011 


784CIP2B_331 


6270 


654 


2440 


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227 


6013 


784CIP2B 334 


0^ it. 


656 


2442 


4228 


6014 


784CIP2B_335 




657 


2443 


4229 


6015 


784CIP2B_336 


6281 


658 


2444 


4230 


6016 


784CIP2B_337 


6281 


659 


2445 


4231 


6017 


784CIP2B_338 




660 


2446 


4232 


6018 


784CIP2B 339 


6292 


661 


2447 


4233 


6019 


784CIP2B 340 


6294 


662 


2448 


4234 


6020 


784CIP2B_343 


6312 


663 


2449 


4235 


6021 


784CIP2B 344 


6312 


664 


2450 


4236 


6022 


784CIP2B 345 


6312 


665 


2451 


4237 


6023 


784CIP2B 346 


6322 


666 


2452 


4238 


6024 


784CIP2B_347 


6324 


667 


2453 


4239 


6025 


784CIP2B_349 


6329 


668 


2454 


4240 


6026 


784CIP2B_350 


6331 


669 


2455 


4241 


6027 


784CIP2B_351 


6333 


670 


2456 


4242 


6028 


784CIP2B_352 


6334 


671 


2457 


4243 


6029 


784CIP2B_353 


6337 


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B_357 


6348 


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679 


2465 


4251 


6037 


784CIP2B 361 


6362 
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 
of contig 
peptide 
sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S.S.N. 
09/488,725 


680 


2466 


4252 


6038 


784CIP2B 362 


6368 


681 


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468 


4254 


6040 


784CIP2B_364 


6371 


683 


2469 


4255 


6041 


784CIP2B 365 


6376 


684 


2470 


4256 


6042 


784CIP2B_366 


6379 


685 


2471 


4257 


6043 


784CIP2B 367 


6380 


686 


2472 


4258 


6044 


784CIP2B 368 


6381 


687 


2473 


4259 


6045 


784CIP2B 369 


6392 


688 


2474 


4260 


6046 


784CIP2B_370 


6395 


689 


2475 


4261 


6047 


784CIP2B 371 


6397 


690 


2476 


4262 


6048 


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B 373 


6401 


692 


2478 


4264 


6050 


784CIP2B 374 


6411 


693 


2479 


4265 


6051 


784CIP2B 375 


6411 


694 


2480 


4266 


6052 


784CIP2B 376 


6411 


695 


2481 


4267 


6053 


784CIP2B 377 


6416 


696 


2482 


4268 


6054 


784CIP2B 378 


6418 


697 


2483 


4269 


6055 


784CIP2B 379 


6422 


698 


2484 


4270 


6056 


784CIP2B_380 


6423 


699 


2485 


4271 


6057 


784CIP2B 381 


6426 


700 


2486 


4272 


6058 


784CIP2B 382 


6427 


701 


2487 


4273 


6059 


784CIP2B 383 


6428 


702 


2488 


4274 


6060 


784CIP2B 384 


6429 


703 


2489 


4275 


6061 


784CIP2B 385 


6430 


704 


2490 


4276 


6062 


784CIP2B 386 


6432 


705 


2491 


4277 


6063 


784CIP2B_387 


6432 


706 


2492 


4278 


6064 


784CIP2B_388 


Q4JO 


707 


2493 


4279 


6065 


784CIP2B_389 


6441 


708 


2494 


4280 


6066 


784CIP2B_390 


6446 


709 


2495 


4281 


6067 


784CIP2B_391 


6454 


710 


2496 


4282 


6068 


784CIP2B 392 


6459 


711 


2497 


4283 


6069 


784CIP2B_394 


6461 


712 


2498 


4284 


6070 


784CIP2B_395 


"T4T7 


713 


2499 


4285 


6071 


784CIP2B 396 


6468 


714 


2500 


4286 


6072 


784CIP2B 397 


6487 


715 


2501 


4287 


6073 


784CIP2B 398 


6491 


716 


2502 


4288 


6074 


784CIP2B 399 


6506 


717 


2503 


4289 


6075 


784CIP2B 401 


6514 


718 


2504 


4290 


6076 


784CIP2B 402 


6519 


719 


2505 


4291 


6077 


784CIP2B_403 


6521 


720 


2506 


4292 


6078 


784CIP2B 404 


6532 


721 


2507 


4293 


6079 


784CIP2B 405 


6536 


722 


2508 


4294 


6080 


784CIP2B 406 


6543 


723 


2509 


4295 


6081 


784CIP2B_407 


6544 


724 


2510 


4296 


6082 


784CIP2B 408 


6548 


725 


2511 


4297 


6083 


784CIP2B 409 


6551 


726 


2512 


4298 


6084 


784CIP2B_410 


6551 


727 


2513 


4299 


6085 


784CIP2B 411 


6552 


728 


2514 


4300 


6086 


784CIP2B 412 


6554 


729 


2515 


4301 


6087 


784CIP2B_413 


6556 


730 


2516 


4302 


6088 


784CIP2B 414 


6560 


731 


2517 


4303 


6089 


784CIP2B_415 


6563 


732 


2518 


4304 


6090 


784CIP2B_416 


6564 


733 


2519 


4305 


6091 


784CIP2B_417 


6567 


734 


2520 


4306 


6092 


784CIP2B 418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095 


784CIP2B_421 


6593 


738 


2524 


4310 


6096 


784CIP2B_422 


6595 


739 


2525 


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B_424 


6625 


741 


2527 


4313 


6099 


784CIP2B_425 


6625 






WO 01/53312 
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SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: of 


of contig 


NO: 


docket number_ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S.S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 




sequence 


priority 






sequence 






application 




742 


2528 


4314 


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101 


784CIP2B_427 


6630 


744 


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429 


6632 


746 


2532 


4318 


6104 


784CIP2B 430 


6633 


747 


2533 


4319 


6105 


784CIP2B_431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432 


6638 


749 


2535 


4321 


6107 


784CIP2B 433 


6641 


750 


2536 


4322 


6108 


784CIP2B_434 


6644 


751 


2537 


4323 


6109 


784CIP2B_435 


6646 


752 


2538 


4324 


6110 


784CIP2B_436 


6648 


753 


2539 


4325 


6111 


784CIP2B_437 


6652 


754 


2540 


4326 


6112 


784CIP2B_438 


6654 


755 


2541 


4327 


6113 


784CIP2B_439 


6657 


756 


2542 


4328 


6114 


784CIP2B_440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441 


6663 


758 


2544 


4330 


6116 


784CIP2B_442 


6664 


759 


2545 


4331 


6117 


784CIP2B_443 


6668 


760 


2546 


4332 


6118 


784CIP2B_444 


6669 


761 


2547 


4333 


6119 


784CIP2B_445 


6673 


762 


2548 


4334 


6120 


784CIP2B_446 


6685 


763 


2549 


4335 


6121 


784CIP2B_447 


6687 


764 


2550 


4336 


6122 


784CIP2B_448 


6689 


765 


2551 


4337 


6123 


784CIP2B 449 


6693 


766 


2552 


4338 


6124 


784CIP2B_450 


6698 


767 


2553 


4339 


6125 


784CIP2B_451 


6699 


768 


2554 


4340 


6126 


784CIP2B_452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B_454 


6713 


771 


2557 


4343 


6129 


784CIP2B_455 


6716 


772 


2558 


4344 


6130 


784CIP2B_456 


6725 


773 


2559 


4345 


6131 


784CIP2B_457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458 


6727 


775 


2561 


4347 


6133 


784CIP2B 459 


6730 


776 


2562 


4348 


6134 


784CIP2B 460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350 


6136 


784CIP2B_462 


6732 


779 


2565 


4351 


6137 


784CIP2B_463 


6733 


780 


2566 


4352 


6138 


784CIP2B_464 


6737 


781 


2567 


4353 


6139 


784CIP2B_465 


6745 


782 


2568 


4354 


6140 


784CIP2B_466 


6751 


783 


2569 


4355 


6141 


784CIP2B 467 


6754 


784 


2570 


4356 


6142 


784CIP2B_468 


6758 


785 


2571 


4357 


6143 


784CIP2B 469 


6761 


786 


2572 


4358 


6144 


784CIP2B_470 


6765 


787 


2573 


4359 


6145 


784CIP2B_471 


6768 


788 


2574 


4360 


6146 


784CIP2B 472 


6773 


789 


2575 


4361 


6147 


784CIP2B_473 


6776 


790 


2576 


4362 


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6798 


792 


2578 


4364 


6150 


784CIP2B 476 


6823 


793 


2579 


4365 


6151 


784CIP2B 477 


6825 


794 


2580 


4366 


6152 


784CIP2B_478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479 


6839 


796 


2582 


4368 


6154 


784CIP2B_480 


6844 


797 


2583 


4369 


6155 


784CIP2B_482 


6849 


798 


2584 


4370 


6156 


784CIP2B_483 


6854 


799 


2585 


4371 


6157 


784CIP2B_484 


6857 


800 


2586 


4372 


6158 


784CIP2B 485 


6861 


801 


2587 


4373 


6159 


784CIP2B_486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375 


6161 


784CIP2B_488 


6877 
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SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: of 


of contig 


NO: 


docket number_ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S.S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 




sequence 


priority 






sequence 






application 




804 


2590 


4376 


6162 


784CIP2B_489 


6880 


805 


2591 


4377 


6163 


784CIP2B 490 


6885 


806 


2592 


4378 


6164 


784CIP2B_491 


6890 


807 


2593 


4379 


6165 


784CIP2B 492 


6890 


808 


2594 


4380 


6166 


784CIP2B 493 


6894 


809 


2595 


4381 


6167 


784CIP2B 494 


6901 


810 


2596 


4382 


6168 


784CIP2B_495 


6904 


811 


2597 


4383 


6169 


784CIP2B_496 


6907 


812 


2598 


4384 


6170 


784CIP2B 497 


6914 


813 


2599 


4385 


6171 


784CIP2B 498 


6917 


814 


2600 


4386 


6172 


784CIP2B 499 


6923 


815 


2601 


4387 


6173 


784CIP2B_500 


6929 


816 


2602 


4388 


6174 


784CIP2B 501 


6931 


817 


2603 


4389 


6175 


784CIP2B 502 


6935 


818 


2604 


4390 


6176 


784CIP2B 503 


6940 


819 


2605 


4391 


6177 


784CIP2B 504 


6945 


820 


2606 


4392 


6178 


784CIP2B_505 


6946 


821 


2607 


4393 


6179 


784CIP2B_506 


6947 


822 


2608 


4394 


6180 


784CIP2B_507 


6949 


823 


2609 


4395 


6181 


784CIP2B 508 


6959 


824 


2610 


4396 


6182 


784CIP2B_509 


6960 


825 


2611 


4397 


6183 


784CIP2B 510 


6962 


826 


2612 


4398 


6184 


784CIP2B_511 


6963 


827 


2613 


4399 


6185 


784CIP2B 512 


6967 


828 


2614 


4400 


6186 


784CIP2B 513 


6983 


829 


2615 


4401 


6187 


784CIP2B_514 


6988 


830 


2616 


4402 


6188 


784CIP2B 515 


6996 


831 


2617 


4403 


6189 


784CIP2B 516 


7003 


832 


2618 


4404 


6190 


784CIP2B_517 


7016 


833 


2619 


4405 


6191 


784CIP2B_518 


7017 


834 


2620 


4406 


6192 


784CIP2B 519 


7025 


835 


2621 


4407 


6193 


784CIP2B_520 


7025 


836 


2622 


4408 


6194 


784CIP2B_521 


7025 


837 


2623 


4409 


6195 


784CIP2B 522 


7050 


838 


2624 


4410 


6196 


784CIP2B_523 


7051 


839 


2625 


4411 


6197 


784CIP2B_524 


7055 


840 


2626 


4412 


6198 


784CIP2B_525 


7060 


841 


2627 


4413 


6199 


784CIP2B 526 


7064 


842 


2628 


4414 


6200 


784CIP2B_527 


7067 


843 


2629 


4415 


6201 


784CIP2B 528 


7071 


844 


2630 


4416 


6202 


784CIP2B_529 


7072 


845 


2631 


4417 


6203 


784CIP2B_530 


7073 


846 


2632 


4418 


6204 


784CIP2B_531 


7076 


847 


2633 


4419 


6205 


784CIP2B 532 


7088 


848 


2634 


4420 


6206 


784CIP2B 533 


7089 


849 


2635 


4421 


6207 


784CIP2B_534 


7091 


850 


2636 


4422 


6208 


784CIP2B_535 


7091 


851 


2637 


4423 


6209 


784CIP2B 536 


7104 


852 


2638 


4424 


6210 


784CIP2B 537 


7105 


853 


2639 


4425 


6211 


784CIP2B_538 


7105 


854 


2640 


4426 


6212 


784CIP2B_539 


7109 


855 


2641 


4427 


6213 


784CIP2B_540 


7109 


856 


2642 


4428 


6214 


784CIP2B_541 


7119 


857 


2643 


4429 


6215 


784CIP2B_542 


7120 


858 


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431 


6217 


784CIP2B_544 


7126 


860 


2646 


4432 


6218 


784CIP2B_545 


7127 


861 


2647 


4433 


6219 


784CIP2B_546 


7130 


862 


2648 


4434 


6220 


784CIP2B_547 


7131 


863 


2649 


4435 


6221 


784CIP2B_548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651 


4437 


6223 


784CIP2B_550 


7163 
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SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO : of 


of contig 


NO : 


docket number_ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S.S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 




sequence 


priority 






sequence 






application 




866 


2652 


4438 


6224 


784CIP2B_551 


7175 


867 


2653 


4439 


6225 


784CIP2B_552 


/Xbo 


868 


2654 


4440 


6226 


784CIP2B_553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


/J.3U 


870 


2656 


4442 


6228 


784CIP2B_555 


7191 


871 


2657 


4443 


6229 


784CIP2B_556 


/A03 


872 


2658 


4444 


6230 


784CIP2B_557 


/AU4 


873 


2659 


4445 


6231 


784CIP2B_558 


7208 


874 


2660 


4446 


6232 


784CIP2B_559 


7209 


875 


2661 


4447 


6233 


784CIP2B_560 


7210 


876 


2662 


4448 


6234 


784CIP2B_561 


7216 


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238 


784CIP2B_565 


7240 


881 


2667 


4453 


6239 


784CIP2B_566 


7245 


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B_568 


7251 


884 


2670 


4456 


6242 


784CIP2B_569 


7255 


885 


2671 


4457 


6243 


784CIP2B_570 


7260 


886 


2672 


4458 


6244 


784CIP2B_571 


7265 


887 


2673 


4459 


6245 


784CIP2B_572 


7268 


888 


2674 


4460 


6246 


784CIP2B_573 


7275 


889 


2675 


4461 


6247 


784CIP2B_574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283 


891 


2677 


4463 


6249 


784CIP2B_576 


7283 


892 


2678 


4464 


6250 


784CIP2B_577 


7287 


893 


2679 


4465 


6251 


784CIP2B_578 


7301 


894 


2680 


4466 


6252 


784CIP2B_579 


7308 


895 


2681 


4467 


6253 


784CIP2B_580 


7308 


896 


2682 


4468 


6254 


784CIP2B_581 


7309 


897 


2683 


4469 


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B_583 


7320 


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


901 


2687 


4473 


6259 


784CIP2B_586 


7334 


902 


2688 


4474 


6260 


784CIP2B_587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588 


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4477 


6263 


784CIP2B_590 


7355 


906 


2692 


4478 


6264 


784CIP2B_591 


7363 


907 


2693 


4479 


6265 


784CIP2B_592 


7363 


908 


2694 


4480 


6266 


784CIP2B_593 


/Job 


909 


2695 


4481 


6267 


784CIP2B_594 


7368 


910 


2696 


4482 


6268 


784CIP2B_595 





911 


2697 


4483 


6269 


784CIP2B_596 


/O /A 


912 


2698 


4484 


6270 


784CIP2B_599 


/D /D 


913 


2699 


4485 


6271 


784CIP2B_600 


/DOX 


914 


2700 


4486 


6272 


784CIP2B_601 


Ma J 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B_603 


/d9X 


917 


2703 


4489 


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B 605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606 


7397 


920 


2706 


4492 


6278 


784CIP2B_607 


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405 


922 


2708 


4494 


6280 


784CIP2B_609 


7406 


923 


2709 


4495 


6281 


784CIP2B_610 


7406 


924 


2710 


4496 


6282 


784CIP2B_611 


7409 


925 


2711 


4497 


6283 


784CIP2B_612 


7410 


926 


2712 


4498 


6284 


784CIP2B 613 


7411 


927 


2713 


4499 


6285 


784CIP2B 614 


7417 
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SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: of 


of contig 


NO: 


docket number_ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S.S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 




sequence 


priority 






sequence 






application 




928 


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715 


4501 


6287 


784CIP2B_616 


7421 


930 


2716 


4502 


6288 


784CIP2B_617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618 


7422 


932 


2718 


4504 


6290 


784CIP2B_619 


7423 


933 


2719 


4505 


6291 


784CIP2B_620 


7424 


934 


2720 


4506 


6292 


784CIP2B_621 


7425 


935 


2721 


4507 


6293 


784CIP2B_622 


7427 


936 


2722 


4508 


6294 


784CIP2B_623 


740 0 


937 


2723 


4509 


6295 


784CIP2B_624 





938 


2724 


4510 


6296 


784CIP2B_625 





939 


2725 


4511 


6297 


784CIP2B_626 





940 


2726 


4512 


6298 


784CIP2B_627 


7 A 7 Q 


941 


2727 


4513 


6299 


784CIP2B_628 


0 a a n 


942 


2728 


4514 


6300 


784CIP2B_629 


OA AO 


943 


2729 


4515 


6301 


784CIP2B_630 


O A Cft 


944 


2730 


4516 


6302 


784CIP2B_631 


7451 


945 


2731 


4517 


6303 


784CIP2B_632 


OA CO 


946 


2732 


4518 


6304 


784CIP2B_633 


7454 


947 


2733 


4519 


6305 


784CIP2B_634 


7457 


948 


2734 


4520 


6306 


784CIP2B_635 


7459 


949 


2735 


4521 


6307 


784CIP2B_636 


7461 


950 


2736 


4522 


6308 


784CIP2B_637 


O A c 0 


951 


2737 


4523 


6309 


784CIP2B_638 


7466 


952 


2738 


4524 


6310 


784CIP2B_639 


^ O A C G 


953 


2739 


4525 


6311 


784CIP2B_640 


7473 


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955 


2741 


4527 


6313 


784CIP2B_642 


7482 


956 


2742 


4528 


6314 


784CIP2B_643 


OA QO 


957 


2743 


4529 


6315 


784CIP2B_644 


74 QO 


958 


2744 


4530 


6316 


784CIP2B_645 


7485 


959 


2745 


4531 


6317 


784CIP2B_646 


OA QC 


960 


2746 


4532 


6318 


784CIP2B_647 


7487 


961 


2747 


4533 


6319 


784CIP2B_648 


OA Q 1 


962 


2748 


4534 


6320 


784CIP2B_649 


7492 


963 


2749 


4535 


6321 


784CIP2B_650 


OA QA 


964 


2750 


4536 


6322 


784CIP2B_651 


OA QQ 


965 


2751 


4537 


6323 


784CIP2B_652 


O C OA 


966 


2752 


4538 


6324 


784CIP2B_653 


/DUO 


967 


2753 


4539 


6325 


784CIP2B_654 


TCI C 


968 


2754 


4540 


6326 


784CIP2B_655 


7518 


969 


2755 


4541 


6327 


784CIP2B_656 


7C 1 O 


970 


2756 


4542 


6328 


784CIP2B_657 


/ DZX 


971 


2757 


4543 


6329 


784CIP2B_658 


7EOQ 


972 


2758 


4544 


6330 


784CIP2B_659 


71^0 O 


973 


2759 


4545 


6331 


784CIP2B_660 


/ D J J 


974 


2760 


4546 


6332 


784CIP2B_661 


/ 3j D 


975 


2761 


4547 


6333 


784CIP2B_662 


7 C4 C 


976 


2762 


4548 


6334 


784CIP2B_663 


trTS 


977 


2763 


4549 


6335 


784CIP2B_664 


7C CO 


978 


2764 


4550 


6336 


784CIP2B_665 


OCCA 


979 


2765 


4551 


6337 


784CIP2B_666 


OCCO 


980 


2766 


4552 


6338 


784CIP2B_667 


7569 


981 


2767 


4553 


6339 


784CIP2B 668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669 


7576 


983 


2769 


4555 


6341 


784CIP2B_670 


7577 


984 


2770 


4556 


6342 


784CIP2B_671 


7579 


985 


2771 


4557 


6343 


784CIP2B_672 


7582 


986 


2772 


4558 


6344 


784CIP2B_673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674 


7589 


988 


2774 


4560 


6346 


784CIP2B_675 


7597 


989 


2775 


4561 


6347 


784CIP2B 676 


7597 
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SEQ ID NO: " 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 

of contig 

nucleotide 

sequence 


SEQ ID 
NO : 

of contig 

peptide 

sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 

priority 
application 


SEQ ID 
NO: in 
U.S.S.N. 
09/488,725 


990 


2776 


4562 


6348 


784CIP2B 677 


7609 


991 


2777 


4563 


6349 


784CIP2B 678 


7609 


992 


2778 


4564 


6350 


784CIP2B 679 


7609 


993 


2779 


4565 


6351 


784CIP2B 680 


7613 


994 


2780 


4566 


6352 


784CIP2B 681 


7623 


995 


2781 


4567 


6353 


784CIP2B 682 


7629 


996 


2782 


4568 


6354 


784CIP2B 683 


7630 


997 


2783 


4569 


6355 


784CIP2B 684 


7633 


998 


2784 


4570 


6356 


784CIP2B_685 


7635 


999 


2785 


4571 


6357 


784CIP2B 686 


7638 


1000 


2786 


4572 


6358 


784CIP2B 687 


7639 


1001 


2787 


4573 


6359 


784CIP2B 688 


7646 


1002 


2788 


4574 


6360 


784CIP2B 689 


7647 


1003 


2789 


4575 


6361 


784CIP2B_690 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


7658 


1005 


2791 


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578 


6364 


784CIP2B 693 


7664 


1007 


2793 


4579 


6365 


784CIP2B 695 


7674 


1008 


2794 


4580 


6366 


784CIP2B 696 


7675 


1009 


2795 


4581 


6367 


784CIP2B 697 


7676 


1010 


2796 


4582 


6368 


784CIP2B 698 


7681 


1011 


2797 


4583 


6369 


784CIP2B 699 


7688 


1012 


2798 


4584 


6370 


784CIP2B 700 


7693 


1013 


2799 


4585 


6371 


784CIP2B 701 


7694 


1014 


2800 


4586 


6372 


784CIP2B 702 


7715 


1015 


2801 


4587 


6373 


784CIP2B_703 


7716 


1016 


2802 


4588 


6374 


784CIP2B 704 


7718 


1017 


2803 


4589 


6375 


784CIP2B 705 


7721 


1018 


2804 


4590 


6376 


784CIP2B 706 


7723 


1019 


2805 


4591 


6377 


784CIP2B 707 


7729 


1020 


2806 


4592 


6378 


784CIP2B 708 


7733 


1021 


2807 


4593 


6379 


784CIP2B 709 


7735 


1022 


2808 


4594 


6380 


784CIP2B_710 


7741 


1023 


2809 


4595 


6381 


784CIP2B 711 


7743 


1024 


2810 


4596 


6382 


784CIP2B 712 


7748 


1025 


2811 


4597 


6383 


784CIP2B 713 


7749 


1026 


2812 


4598 


6384 


784CIP2B 714 


7750 


1027 


2813 


4599 


6385 


784CIP2B 715 


7757 


1028 


2814 


4600 


6386 


784CIP2B 716 


7759 


1029 


2815 


4601 


6387 


784CIP2B_717 


7760 


1030 


2816 


4602 


6388 


784CIP2B 718 


7760 


1031 


2817 


4603 


6389 


784CIP2B 719 


7764 


1032 


2818 


4604 


6390 


784CIP2B 720 


7765 


1033 


2819 


4605 


6391 


784CIP2B 721 


7766 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767 


1035 


2821 


4607 


6393 


784CIP2B 723 


7769 


1036 


2822 


4608 


6394 


784CIP2B 724 


7770 


1037 


2823 


4609 


6395 


784CIP2B 725 


7774 


1038 


2824 


4610 


6396 


784CIP2B 726 


7779 


1039 


2825 


4611 


6397 


784CIP2B 727 


7781 


1040 


2826 


4612 


6398 


784CIP2B 728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783 


1042 


2828 


4614 


6400 


784CIP2B 730 


7787 


1043 


2829 


4615 


6401 


784CIP2B_731 


7792 


1044 


2830 


4616 


6402 


784CIP2B 732 


7795 


1045 


2831 


4617 


6403 


784CIP2B_733 


7801 


1046 


2832 


4618 


6404 


784CIP2B_734 


7807 


1047 


2833 


4619 


6405 


784CIP2B_735 


7808 


1048 


2834 


4620 


6406 


784CIP2B_736 


7819 


1049 


2835 


4621 


6407 


784CIP2B_737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051 


2837 


4623 


6409 


784CIP2B_739 


7829 
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S.S.N. 
09/488,725 


1052 


2838 


4624 


6410 


784CIP2B_740 


7832 


1053 


2839 


4625 


6411 


784CIP2B 741 


7839 


1054 


2840 


4626 


6412 


784CIP2B_743 


7847 


1055 


2841 


4627 


6413 


784CIP2B_744 


7848 


1056 


2842 


4628 


6414 


784CIP2B 745 


7853 


1057 


2843 


4629 


6415 


784CIP2B_746 


7854 


1058 


2844 


4630 


6416 


784CIP2B_747 


7856 


1059 


2845 


4631 


6417 


784CIP2B_748 


7862 


1060 


2846 


4632 


6418 


784CIP2B 749 


7865 


1061 


2847 


4633 


6419 


784CIP2B_750 


7874 


1062 


2848 


4634 


6420 


784CIP2B_751 


7877 


1063 


2849 


4635 


6421 


784CIP2B 752 


7880 


1064 


2850 


4636 


6422 


784CIP2B 753 


7882 


1065 


2851 


4637 


6423 


784CIP2B 754 


7884 


1066 


2852 


4638 


6424 


784CIP2B_755 


7886 


1067 


2853 


4639 


6425 


784CIP2B_756 


7888 


1068 


2854 


4640 


6426 


784CIP2B_757 


7889 


1069 


2855 


4641 


6427 


784CIP2B 758 


7901 


1070 


2856 


4642 


6428 


784CIP2B_759 


7910 


1071 


2857 


4643 


6429 


784CIP2B_760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761 


7921 


1073 


2859 


4645 


6431 


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B_763 


7924 


1075 


2861 


4647 


6433 


784CIP2B_764 


7925 


1076 


2862 


4648 


6434 


784CIP2B 765 


7928 


1077 


2863 


4649 


6435 


784CIP2B 766 


7929 


1078 


2864 


4650 


6436 


784CIP2B_767 


7930 


1079 


2865 


4651 


6437 


784CIP2B 768 


7934 


1080 


2866 


4652 


6438 


784CIP2B_769 


7938 


1081 


2867 


4653 


6439 


784CIP2B_770 


7942 


1082 


2868 


4654 


6440 


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B_772 


7946 


1084 


2870 


4656 


6442 


784CIP2B 773 


7948 


1085 


2871 


4657 


6443 


784CIP2B_774 


7951 


1086 


2872 


4658 


6444 


784CIP2B_775 


7952 


1087 


2873 


4659 


6445 


784CIP2B 776 


7953 


1088 


2874 


4660 


6446 


784CIP2B 777 


7954 


1089 


2875 


4661 


6447 


784CIP2B 778 


7957 


1090 


2876 


4662 


6448 


784CIP2B_779 


7958 


1091 


2877 


4663 


6449 


784CIP2B_780 


7961 


1092 


2878 


4664 


6450 


784CIP2B_781 


7965 


1093 


2879 


4665 


6451 


784CIP2B 782 


7966 


1094 


2880 


4666 


6452 


784CIP2B_783 


7979 


1095 


2881 


4667 


6453 


784CIP2B_784 


7986 


1096 


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B 786 


7988 


1098 


2884 


4670 


6456 


784CIP2B 787 


7991 


1099 


2885 


4671 


6457 


784CIP2B 788 


7992 


1100 


2886 


4672 


6458 


784CIP2B 789 


7992 


1101 


2887 


4673 


6459 


784CIP2B_790 


7992 


1102 


2888 


4674 


6460 


784CIP2B 791 


7992 


1103 


2889 


4675 


6461 


784CIP2B 792 


8003 


1104 


2890 


4676 


6462 


784CIP2B 793 


8014 


1105 


2891 


4677 


6463 


784CIP2B_794 


8015 


1106 


2892 


4678 


6464 


784CIP2B_795 


8016 


1107 


2893 


4679 


6465 


784CIP2B_796 


8017 


1108 


2894 


4680 


6466 


784CIP2B_797 


8019 


1109 


2895 


4681 


6467 


784CIP2B 798 


8020 


1110 


2896 


4682 


6468 


784CIP2B_799 


8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B_801 


8028 


1113 


2899 


4685 


6471 


784CIP2B 802 


8030 
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SEQ ID 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: of 


of contig 


NO: 


docket number_ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S.S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 




sequence 


priority 






sequence 






application 




1114 


2900 


4686 


6472 


784CIP2B 803 


8038 


1115 


2901 


4687 


6473 


784CIP2B 804 


8042 


1116 


2902 


4688 


6474 


784CIP2B 805 


8045 


1117 


2903 


4689 


6475 


784CIP2B 806 


8045 


1118 


2904 


4690 


6476 


784CIP2B 807 


8046 


1119 


2905 


4691 


6477 


784CIP2B 808 


8047 


1120 


2906 


4692 


6478 


784CIP2B 809 


8051 


1121 


2907 


4693 


6479 


784CIP2B 810 


8059 


1122 


2908 


4694 


6480 


784CIP2B 811 


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482 


784CIP2B 813 


8074 


1125 


2911 


4697 


6483 


784CIP2B_814 


8077 


1126 


2912 


4698 


6484 


7B4CIP2B fllR 


8078 


1127 


2913 


4699 


6485 


784CIP2B fll6 


8079 


1128 


2914 


4700 


6486 


7B4PTP7R ft! 7 


8084 


1129 


2915 


4701 


6487 


7B4CIP2R 818 


8088 


1130 ' 


2916 


4702 


648 8 


1 784PTP2B 81 Q 


8090 1 


1131 


2917 


4703 


6489 


7B4PTP7R 850 




1132 


2918 


4704 


6490 


784PTP7R 891 


8099 


1133 


2919 


4705 


6491 


784PTP7R 897 


8(1 QQ 


1134 


2920 


4706 


6492 


784PTP7R R71 




1135 


2 92 1 


4707 


6493 




0J.UZ 


1136 


2922 


4708 


6494 


784CIP2B_825


OJ.UJ 


1137


2923


4709


6495


784CIP2B_826

OlU j 


1138


2924


4710


6496


784CIP2B_827

q i nil 


1139


2925


4711


6497


784CIP2B_828

oi no 
oiuo 


1140 


2926 


4712 


6498


784CIP2B_829

o ± ± U 


1141 


2927 


4713 


6499 


784CIP2B_830


81 1 


1142 


2928


4714


6500


784CIP2B_831


8117


1143 


2929 


4715


6501


784CIP2B_832


Rl 71 
Ol^J 


1144 


2930


4716


6502


784CIP2B_833


8130


1145 


2931


4717


6503


784CIP2B_834


8130


1146 


2932 


4718 


6504 


784CIP2B_835


8143 


1147 


2933


4719


6505


784CIP2B_836


8143 


1148 


2934 


4720 


6506 


784CIP2B_837


8154 


1149 


2935 


4721


6507


784CIP2B_838


0133 


1150 


2936 


4722 


6508 


784CIP2B_839


8162 


1151 


2937


4723 


6509 


784CIP2B 840 


■91^3 


1152 


2938 


4724


6510


784CIP2B_841


8172 


1153 


2939 


4725 


6511 


784CIP2B_842


8173 


1154 


2940 


4726 


6512 


784CIP2B_843


8179 


1155 


2941 


4727 


6513 


784CIP2B 844 


8182


1156 


2942 


4728 


6514


784CIP2B 845 


8183 


1157 


2943 


4729


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B_847


8185 


1159 


2945


4731


6517


784CIP2B_848


8187 


1160 


2946 


4732 


6518 


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B_850


8190 


1162 


2948


4734


6520


784CIP2B_851


8190 


1163 


2949 


4735 


6521


784CIP2B_852


8192


1164 


2950 


4736 


6522 


784CIP2B_853


8193 


1165 


2951 


4737 


6523 


784CIP2B_854


8197 


1166 


2952 


4738 


6524 


784CIP2B 855 


8197 


1167 


2953 


4739 


6525 


784CIP2B_856 


8199 


1168 


2954


4740 


6526 


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B 858 


8203 


1170


2956 


4742 


6528 


784CIP2B_859 


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173 


2959 


4745 


6531 


784CIP2B_862 


8214 


1174 


2960 


4746 


6532 


784CIP2B_863 


8217 


1175 


2961 


4747 


6533 


784CIP2B 864 


8223


1176


2962


4748


6534


784CIP2B_865


8224


1177


2963


4749


6535


784CIP2B_866


8226


1178


2964


4750


6536


784CIP2B_867


8227


1179


2965


4751


6537


784CIP2B_868


8229


1180


2966


4752


6538


784CIP2B_869


8232


1181


2967


4753


6539


784CIP2B_870


8236


1182


2968


4754


6540


784CIP2B_871


8239


1183


2969


4755


6541


784CIP2B_872


8244


1184


2970


4756


6542


784CIP2B_873


8245


1185


2971


4757


6543


784CIP2B_874


8248


1186


2972


4758


6544


784CIP2B_875


8251


1187


2973


4759


6545


784CIP2B_876


8253


1188


2974


4760


6546


784CIP2B_877


8260


1189


2975


4761


6547


784CIP2B_878


8262


1190


2976


4762


6548


784CIP2B_879


8268


1191


2977


4763


6549


784CIP2B_880


8270


1192


2978


4764


6550


784CIP2B_881


8272


1193


2979


4765


6551


784CIP2B_882


8274


1194


2980


4766


6552


784CIP2B_883


8274


1195


2981


4767


6553


784CIP2B_884


8275


1196


2982


4768


6554


784CIP2B_885


8277


1197


2983


4769


6555


784CIP2B_886


8281


1198


2984


4770


6556


784CIP2B_887


8283


1199


2985


4771


6557


784CIP2B_888


8289


1200


2986


4772


6558


784CIP2B_889


8295


1201


2987


4773


6559


784CIP2B_890


8300


1202


2988


4774


6560


784CIP2B_891


8303


1203


2989


4775


6561


784CIP2B_892


8304


1204


2990


4776


6562


784CIP2B_893


8305


1205


2991


4777


6563


784CIP2B_894


8309


1206


2992


4778


6564


784CIP2B_895


8318


1207


2993


4779


6565


784CIP2B_896


8319


1208


2994


4780


6566


784CIP2B_897


8321


1209


2995


4781


6567


784CIP2B_898


8322


1210


2996


4782


6568


784CIP2B_899


8323


1211


2997


4783


6569


784CIP2B_900


8325


1212


2998


4784


6570


784CIP2B_901


8331


1213


2999


4785


6571


784CIP2B_902


8332


1214


3000


4786


6572


784CIP2B_903


8333


1215


3001


4787


6573


784CIP2B_904


8335


1216


3002


4788


6574


784CIP2B_905


8336


1217


3003


4789


6575


784CIP2B_906


8337


1218


3004


4790


6576


784CIP2B_907


8340


1219


3005


4791


6577


784CIP2B_908


8343


1220


3006


4792


6578


784CIP2B_909


8347


1221


3007


4793


6579


784CIP2B_910


8349


1222


3008


4794


6580


784CIP2B_911


8351


1223


3009


4795


6581


784CIP2B_912


Q 1 C7 
O j3j 


1224


3010


4796


6582


784CIP2B_913


8355


1225


3011


4797


6583


784CIP2B_914


8361


1226


3012


4798


6584


784CIP2B_915


8365


1227


3013


4799


6585


784CIP2B_916


8367


1228 


3014 


4800 


6586 


784CIP2B 917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921 


8391 


1232 


3018 


4804 


6590 


784CIP2B 922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807 


6593 


784CIP2B_925 


8395 


1236 


3022 


4808 


6594 


784CIP2B_926 


8396 


1237 


3023 " 


4809 


6595 


784CIP2B 927 


8398


1238 


3024 


4810 


6596 


784CIP2B 928 


8402 


1239 


3025 


4811 


6597 


784CIP2B 929 


8402 


1240 


3026 


4812 


6598 


784CIP2B 930 


8405 


1241 


3027 


4813 


6599 


784CIP2B 931 


8406 


1242 


3028 


4814 


6600 


784CIP2B 932 


8409 


1243 


3029 


4815 


6601 


784CIP2B 933 


8410 


1244 


3030 


4816 


6602 


784CIP2B_934


8414


1245 


3031 


4817 


6603 


784CIP2B 935 


8415 


1246 


3032 


4818 


6604 


784CIP2B 936 


8419 


1247 


3033 


4819 


6605 


784CIP2B 937 


8426


1248 


3034 


4820 


6606


784CIP2B 938 


8430 


1249 


3035 


4821 


6607 


784CIP2B 939 


8431 


1250 


3036 


4822 


6608 


784CIP2B 940 


8432 


1251 


3037 


4823 


6609 


784CIP2B_941


8433 


1252 


3038 


4824 


6610 


784CIP2B 942 


8434 


1253 


3039 


4825 


6611


784CIP2B_943


8438


1254 


3040 


4826 


6612 


784CIP2B_944




1255 


3041 


4827 


6613 


784CIP2B_945


8441 


1256 


3042 


4828 


6614 


784CIP2B_946


8450 


1257 


3043 


4829 


6615 


784CIP2B_947


8451 


1258 


3044 


4830


6616


784CIP2B_948


Oft 3Z 


1259 


3045 


4831 


6617 


784CIP2B_949


0 4DU 


1260 


3046 


4832 


6618 


784CIP2B_950


8461 


1261 


3047 


4833


6619


784CIP2B_951




1262 


3048 


4834 


6620 


784CIP2B_952




1263 


3049 


4835 


6621 


784CIP2B_953




1264 


3050 


4836 


6622


784CIP2B_954


8467


1265 


3051 


4837


6623


784CIP2B_955


8470 


1266 


3052 


4838 


6624 


784CIP2B_956


8471 


1267 


3053 


4839 


6625 


784CIP2B_957


8473


1268 


3054 


4840 


6626 


784CIP2B_958


8474 


1269 


3055 


4841 


6627 


784CIP2B 959 


8475 


1270 


3056


4842 


6628 


784CIP2B 960 


8476 


1271 


3057 


4843 


6629 


784CIP2B 961 


8480 


1272 


3058 


4844 


6630 


784CIP2B_962


8482 


1273 


3059 


4845 


6631 


784CIP2B_963


8482 


1274 


3060 


4846 


6632 


784CIP2B 964 


8486 


1275 


3061 


4847 


6633 


784CIP2B 965 


8488 


1276 


3062 


4848 


6634 


784CIP2B 966 


8492 


1277 


3063 


4849 


6635 


784CIP2B 967 


8494 


1278 


3064 


4850 


6636 


784CIP2B 968 


8496 


1279 


3065 


4851 


6637 


784CIP2B 969 


8497 


1280


3066 


4852 


6638 


784CIP2B 970 


8499 


1281 


3067 


4853 


6639 


784CIP2B 971 


8513 


1282 


3068 


4854 


6640 


784CIP2B_972


8522


1283 


3069 


4855 


6641 


784CIP2B 973 


8526 


1284 


3070 


4856 


6642 


784CIP2B 974 


8531 


1285 


3071 


4857


6643


784CIP2B_975


8533


1286 


3072 


4858 


6644 


784CIP2B_976


8542


1287


3073


4859


6645


784CIP2B_977




1288 


3074 


4860


6646


784CIP2B_978




1289 


3075 


4861 


6647 


784CIP2B_979


8565


1290 


3076 


4862 


6648


784CIP2B 980 


8572 


1291 


3077 


4863


6649 


784CIP2B_981 


8576 


1292 


3078 


4864 


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B_983 


8584 


1294


3080 


4866 


6652 


784CIP2B_984


8598 


1295 


3081 


4867 


6653 


784CIP2B_985 


8602 


1296 


3082 


4868 


6654 


784CIP2B_986


8604 


1297 


3083 


4869


6655 


784CIP2B_987 


8609 


1298 


3084 


4870 


6656 


784CIP2B_988 


8612 


1299 


3085 


4871


6657


784CIP2B_989


8637


1300 


3086 


4872 


6658 


784CIP2B 990 


8640


1301 


3087 


4873 


6659 


784CIP2B 991 


8643 


1302


3088 


4874 


6660 


784CIP2B 992 


8645


1303


3089 


4875 


6661 


784CIP2B 993 


8650 


1304 


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B 995 


8654 


1306 


3092 


4878 


6664 


784CIP2B 996 


8655


1307 


3093 


4879 


6665 


784CIP2B_997


8657 


1308 


3094 


4880 


6666 


784CIP2B 998 


8665 


1309 


3095 


4881 


6667 


784CIP2B 999 


8668 


1310 


3096 


4882 


6668 


784CIP2B 1000 


8671


1311 


3097 


4883 


6669 


784CIP2B 1001 


8672 


1312 


3098 


4884 


6670 


784CIP2B 1002 


8692 


1313 


3099 


4885 


6671 


784CIP2B 1003 


8706 


1314 


3100 


4886 


6672 


784CIP2B 1004 


8716


1315 


3101 


4887 


6673 


784CIP2B 1005 


8719 


1316 


3102 


4888 


6674 


784CIP2B 1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B 1007 


8764 


1318 


3104 


4890 


6676 


784CIP2B 1008 


8764 


1319 


3105 


4891 


6677 


784CIP2B 1009 


8764 


1320 


3106 


4892 


6678 


784CIP2B_1010


8774 


1321 


3107 


4893 


6679 


784CIP2B 1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B 1012 


8796 


1323 


3109 


4895 


6681 


784CIP2B 1013 


8827 


1324 


3110 


4896 


6682 


784CIP2B 1014 


8842 


1325 


3111 


4897 


6683 


784CIP2B 1015 


8842 


1326 


3112 


4898 


6684 


784CIP2B 1016 


8858


1327 


3113 


4899 


6685 


784CIP2B 1017 


8871


1328 


3114 


4900 


6686 


784CIP2B 1018 


8921 


1329 


3115 


4901 


6687 


784CIP2B 1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B 1020 


8942 


1331 


3117 


4903 


6689 


784CIP2B_1021


8994 


1332 


3118 


4904 


6690


784CIP2B_1022


9023 


1333 


3119 


4905 


6691 


784CIP2B_1023


9028 


1334 


3120 


4906 


6692 


784CIP2B 1024 


9056 


1335 


3121 


4907 


6693 


784CIP2B_1025


9058 


1336 


3122 


4908 


6694 


784CIP2B 1026 


9079


1337 


3123 


4909 


6695 


784CIP2B 1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B 1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B 1029 


9084 


1340 


3126 


4912 


6698


784CIP2B 1030 


9093 


1341 


3127 


4913 


6699 


784CIP2B 1031 


9101 


1342 


3128


4914 


6700 


784CIP2B 1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B 1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B 1034 


9151 


1345 


3131 


4917 


6703 


784CIP2B 1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B 1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B 1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B_1038


9204


1349 


3135


4921


6707 


784CIP2B 1039 


9234


1350 


3136 


4922 


6708 


784CIP2B 1040 


9235 


1351 


3137


4923 


6709 


784CIP2B_1041


9239 


1352 


3138 


4924 


6710 


784CIP2B_1042


9256


1353 


3139 


4925 


6711


784CIP2B_1043 


9276 


1354 


3140 


4926 


6712 


784CIP2B_1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B_1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B_1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B 1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B 1049 


9500 


1360 


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B_1051


9520


1362 


3148 


4934 


6720 


784CIP2B_1052 


9541 


1363 


3149 


4935 


6721 


784CIP2B_1053 


9541 


1364 


3150 


4936 


6722 


784CIP2B_1054 


9548 


1365 


3151 


4937 


6723 


784CIP2B_1055 


9556 


1366 


3152 


4938 


6724 


784CIP2B 1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B_1057 


9575 


1368 


3154 


4940 


6726 


784CIP2B_1058 


9589 


1369 


3155 


4941 


6727 


784CIP2B_1059 


9599 


1370 


3156 


4942 


6728


784CIP2B_1060 


9602 


1371 


3157 


4943 


6729 


784CIP2B_1061 


9606 


1372 


3158 


4944 


6730 


784CIP2B_1062


9622 


1373 


3159 


4945 


6731 


784CIP2B 1063 


9623


1374 


3160


4946 


6732


784CIP2B 1064 


9646 


1375 


3161 


4947 


6733 


784CIP2B_1065 


9747 


1376 


3162 


4948 


6734 


784CIP2B_1066 


9773 


1377 


3163 


4949 


6735 


784CIP2B 1067 


9785 


1378 


3164 


4950 


6736 


784CIP2B_1068


9801 


1379 


3165


4951 


6737 


784CIP2B 1069 


9811 


1380 


3166 


4952 


6738 


784CIP2B 1070 


9843 


1381


3167 


4953


6739 


784CIP2B 1071 


9854 


1382


3168 


4954 


6740 


784CIP2B 1072 


9854 


1383 


3169 


4955 


6741 


784CIP2B 1073 


9864 


1384 


3170 


4956 


6742 


784CIP2B 1074 


9864


1385 


3171 


4957 


6743 


784CIP2B 1075 


9871 


1386 


3172 


4958 


6744 


784CIP2B 1076 


9879 


1387 


3173 


4959 


6745 


784CIP2B 1077 


9881 


1388 


3174 


4960 


6746 


784CIP2B 1078 


9885 


1389 


3175 


4961 


6747 


784CIP2B 1079 


9901 


1390 


3176 


4962 


6748 


784CIP2B 1080 


9912 


1391 


3177 


4963 


6749 


784CIP2B 1081 


9916 


1392 


3178 


4964 


6750 


784CIP2B 1082 


9921 


1393 


3179 


4965 


6751 


784CIP2B 1083 


9925 


1394 


3180 


4966 


6752 


784CIP2B_1084 


9930 


1395 


3181 


4967 


6753 


784CIP2B 1085 


9949 


1396 


3182 


4968 


6754 


784CIP2B 1086 


9951 


1397 


3183 


4969 


6755 


784CIP2B 1087 


9959 


1398 


3184 


4970 


6756 


784CIP2B 1088 


9973 


1399 


3185 


4971 


6757 


784CIP2B_1089 


9982 


1400 


3186 


4972 


6758 


784CIP2B_1090


9994 


1401 


3187 


4973 


6759 


784CIP2B 1091 


10021 


1402 


3188 


4974 


6760 


784CIP2B 1092 


10041 


1403 


3189 


4975 


6761


784CIP2B 1094 


10067 


1404 


3190 


4976 


6762


784CIP2B_1095 


10073 


1405 


3191 


4977 


6763 


784CIP2B 1096 


10112 


1406 


3192 


4978


6764 


784CIP2B_1097 


10117 


1407 


3193 


4979 


6765 


784CIP2B 1098 


10132 


1408 


3194 


4980 


6766 


784CIP2B 1099 


10169 


1409 


3195 


4981 


6767 


784CIP2B 1100 


10217 


1410 


3196 


4982 


6768 


784CIP2B 1101 


10226 


1411 


3197 


4983 


6769 


784CIP2B 1102 


10232 


1412 


3198 


4984 


6770 


784CIP2B 1103 


10237 


1413 


3199 


4985 


6771 


784CIP2B 1104 


10279 


1414 


3200 


4986 


6772 


784CIP2C_1 


33 


1415 


3201 


4987 


6773 


784CIP2C_2 


271 


1416 


3202 


4988 


6774 


784CIP2C_3


848 


1417 


3203 


4989 


6775 


784CIP2C_4


849 


1418 


3204 


4990 


6776 


784CIP2C_5 


864 


1419 


3205


4991 


6777 


784CIP2C_6 


953 


1420 


3206 


4992 


6778 


784CIP2C_7 


980 


1421 


3207 


4993 


6779 


784CIP2C 8 


1595


1422 


3208 


4994 


6780 


784CIP2C 9 


1697 


1423 


3209 


4995 


6781 


784CIP2C 10 


1744


1424


3210 


4996 


6782 


784CIP2C_11 


1937 


1425 


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998 


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785 


784CIP2C_14


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215 


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2905 


1432 


3218 


5004 


6790 


784CIP2C_19 


2948 


1433 


3219 


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C 21 


2959 


1435 


3221 


5007 


6793 


784CIP2C 22 


2965 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C 24 


2970 


1438 


3224 


5010 


6796 


784CIP2C 25 


2985 


1439 


3225 


5011 


6797 


784CIP2C 26 


2987 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C 28 


2993 


1442 


3228 


5014 


6800 


784CIP2C 29 


3017 


1443 


3229 


5015 


6801 


784CIP2C 30 


3046 


1444 


3230 


5016 


6802 


784CIP2C 31 


3050 


1445 


3231 


5017 


6803 


784CIP2C 32 


3357 


1446 


3232 


5018 


6804 


784CIP2C 33 


3359 


1447 


3233 


5019 


6805 


784CIP2C 34 


3432 


1448 


3234 


5020 


6806 


784CIP2C 35 


3438


1449 


3235 


5021 


6807 


784CIP2C_36


3439


1450 


3236 


5022 


6808


784CIP2C 39 


3463 


1451 


3237 


5023 


6809 


784CIP2C 40 


3466 


1452 


3238 


5024 


6810 


784CIP2C 41 


3466 


1453 


3239 


5025 


6811 


784CIP2C 42 


3467 


1454 


3240 


5026 


6812 


784CIP2C 43 


3468 


1455 


3241 


5027 


6813 


784CIP2C 44 


3483 


1456 


3242 


5028


6814 


784CIP2C 45 


3484 


1457 


3243 


5029 


6815 


784CIP2C 46 


3488


1458


3244 


5030 


6816 


784CIP2C 47 


3491 


1459 


3245 


5031 


6817 


784CIP2C 48 


3493 


1460 


3246 


5032 


6818 


784CIP2C 49 


3494 


1461 


3247 


5033 


6819 


784CIP2C 50 


3495 


1462 


3248 


5034 


6820 


784CIP2C 51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503 


1464 


3250 


5036 


6822 


784CIP2C 53 


3503 


1465 


3251 


5037 


6823 


784CIP2C 54 


3504 


1466 


3252 


5038 


6824 


784CIP2C 55 


3511 


1467 


3253 


5039 


6825 


784CIP2C_56


3531 


1468 


3254 


5040 


6826 


784CIP2C 57 


3536 


1469 


3255 


5041 


6827 


784CIP2C_58 


3546


1470 


3256 


5042 


6828 


784CIP2C 59 


3548 


1471 


3257 


5043 


6829 


784CIP2C 60 


3551 


1472 


3258 


5044 


6830 


784CIP2C 61 


3553 


1473 


3259 


5045 


6831 


784CIP2C_62


3564 


1474 


3260 


5046 


6832


784CIP2C_63


3567


1475 


3261 


5047 


6833 


784CIP2C 64 


3572 


1476 


3262 


5048


6834 


784CIP2C 65 


3573 


1477 


3263 


5049 


6835 


784CIP2C 66 


3574 


1478 


3264 


5050 


6836 


784CIP2C_67 


3583 


1479 


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CIP2C 70 


3629 


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C_72 


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C 74 


3912


1486 


3272 


5058 


6844 


784CIP2C_75 


3924 


1487 


3273 


5059 


6845 


784CIP2C 76 


3928 


1488 


3274 


5060 


6846 


784CIP2C_77


3935 


1489 


3275 


5061 


6847 


784CIP2C_78 


3959 


1490 


3276 


5062 


6848 


784CIP2C_79 


3981 


1491 


3277 


5063 


6849 


784CIP2C 80 


3989 


1492 


3278 


5064 


6850 


784CIP2C 81 


4295 


1493 


3279 


5065 


6851 


784CIP2C_82 


4300 


1494 


3280 


5066 


6852 


784CIP2C_83


4360 


1495 


3281 


5067 


6853 


784CIP2C_84


4362 


1496


3282 


5068 


6854 


784CIP2C_85 


4371 


1497


3283 


5069 


6855 


784CIP2C_86 


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C_89 


4378 


1500 


3286 


5072 


6858 


784CIP2C_90 


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502


3288 


5074 


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C_93 


4421 


1504 


3290 


5076 


6862 


784CIP2C_94


4426 


1505 


3291 


5077 


6863 


784CIP2C_95 


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865 


784CIP2C_97 


4436 


1508 


3294 


5080 


6866 


784CIP2C_98 


4439 


1509 


3295 


5081 


6867 


784CIP2C_99 


4440 


1510 


3296 


5082 


6868 


784CIP2C_100 


4441 


1511 


3297 


5083 


6869 


784CIP2C_101 


4442 


1512 


3298 


5084 


6870 


784CIP2C_102 


4455 


1513 


3299 


5085 


6871 


784CIP2C 103 


4462 


1514 


3300 


5086 


6872 


784CIP2C_104 


4466 


1515 


3301 


5087 


6873 


784CIP2C_105


4469 


1516 


3302 


5088 


6874 


784CIP2C_106 


4477 


1517 


3303 


5089 


6875 


784CIP2C_107


4481 


1518 


3304 


5090 


6876 


784CIP2C_108


4483 


1519 


3305 


5091 


6877 


784CIP2C_109 


4484 


1520 


3306 


5092 


6878 


784CIP2C_110 


4486 


1521 


3307 


5093 


6879 


784CIP2C_111 


4490 


1522 


3308 


5094 


6880 


784CIP2C_112


4499 


1523 


3309 


5095 


6881 


784CIP2C_113 


4503


1524 


3310 


5096


6882 


784CIP2C_114 


4506 


1525 


3311 


5097 


6883 


784CIP2C_115 


4509 


1526 


3312 


5098 


6884 


784CIP2C_116


4514 


1527 


3313 


5099 


6885 


784CIP2C_117 


4516 


1528 


3314 


5100 


6886 


784CIP2C_118


4522 


1529 


3315 


5101 


6887


784CIP2C_119


4525 


1530 


3316 


5102 


6888 


784CIP2C 120 


4527 


1531 


3317 


5103 


6889


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C_122 


4529 


1533 


3319 


5105 


6891 


784CIP2C_123 


4532 


1534 


3320 


5106 


6892 


784CIP2C_124


4537 


1535 


3321 


5107 


6893 


784CIP2C_125 


4538 


1536 


3322 


5108


6894 


784CIP2C_126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552 


1538 


3324 


5110 


6896 


784CIP2C_128


4559 


1539 


3325 


5111 


6897 


784CIP2C_129 


4567 


1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542 


3328 


5114 


6900 


784CIP2C_133 


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330 


5116 


6902 


784CIP2C_135 


4616 


1545 


3331 


5117 


6903 


784CIP2C_136 


4617 


1546 


3332


5118 


6904 


784CIP2C 137 


4618 


1547 


3333 


5119 


6905 


784CIP2C_138 


4620


1548 


3334 


5120 


6906 


784CIP2C 139 


4624 


1549 


3335 


5121 


6907 


784CIP2C 140 


4632 


1550 


3336 


5122 


6908 


784CIP2C 141 


4634 


1551 


3337 


5123 


6909 


784CIP2C_142


ACID 
403 O 


1552 


3338 


5124 


6910 


784CIP2C_143


A C "I Q 
4033 


1553 


3339 


5125 


6911 


784CIP2C_144


4643


1554 


3340 


5126


6912


784CIP2C_145


4644


1555 


3341 


5127 


6913 


784CIP2C_146


4655


1556 


3342 


5128 


6914 


784CIP2C_147


ACCft 
40 O O 


1557 


3343 


5129 


6915 


784CIP2C_148


4677


1558 


3344 


5130 


6916 


784CIP2C_149


4677


1559 


3345 


5131


6917 


7ftdPTP0P icn 
/ O^tlrzL 13v 


4677 


1560 


3346 


5132


6918 


TflifTD^P 1 SO 


A C Q O 
4 O OA 


1561 


3347 


5133 


6919 


784CIP2C_153


A C Q f\ 
4 O 3U 


1562 


3348


5134 


6920 


784CIP2C_154


4691 


1563 


3349


5135 


6921 


784CIP2C_155


A OO O 
4 / Z 1 


1564 


3350 


5136 


6922 


784CIP2C_156


A 7 1 Pi 
4 / 3 U 


1565 


3351


5137


6923


784CIP2C_157


4734 


1566


3352


5138


6924


784CIP2C_158


4 /3 / 


1567 


3353 


5139


6925


784CIP2C_159


4764 


1568 


3354


5140


6926


784CIP2C_160


4786 


1569 


3355 


5141


6927


784CIP2C_161


4793 


1570 


3356 


5142 


6928 


784CIP2C_162


A ft 1 C 
40 A3 


1571 


3357 


5143


6929


784CIP2C_163


A ft O C 

40 zo 


1572 


3358 


5144 


6930 


784CIP2C_164


a ft c n 

40 30 


1573 


3359 


5145 


6931


784CIP2C_165


4853 


1574 


3360 


5146 


6932


784CIP2C_166


4855 


1575 


3361 


5147 


6933


784CIP2C_167


4856 


1576 


3362 


5148 


6934


784CIP2C_168


A Q C T 

4 a b / 


1577 


3363 


5149


6935


784CIP2C_169


4869 


1578 


3364 


5150 


6936


784CIP2C_170


a a 0 a 
4 a /a 


1579 


3365


5151


6937


784CIP2C_171


4880 


1580 


3366 


5152 


6938


784CIP2C_172


AQAO 


1581 


3367


5153


6939


784CIP2C_173


4945


1582 


3368


5154


6940


784CIP2C_174


4950 


1583 


3369 


5155


6941


784CIP2C_175


A ocn 

473A 


1584 


3370 


5156 


6942


784CIP2C_176


4954


1585 


3371 


5157 


6943


784CIP2C_177


4958 


1586 


3372 


5158 


6944


784CIP2C_178


4961


1587 


3373


5159


6945


784CIP2C_179


337U 


1588


3374


5160


6946


784CIP2C_180


CC QQ ~ "i 


1589 


3375


5161


6947


784CIP2C_181


303 A 


1590 


3376 


5162


6948


784CIP2C_182


3 /3A 


1591 


3377


5163


6949


784CIP2C_183


3 /DO 


1592 


3378 


5164


6950


784CIP2C_184


S771 
3 / /I 


1593 


3379 


5165 


6951


784CIP2C_185


S77A 
3 / /4 


1594 


3380 


5166 


6952 


784CIP2C_186


3 / 73 


1595 


3381 


5167 


6953


784CIP2C_187


3D UD 


1596


3382


5168


6954


784CIP2C_188


3 o3A 


1597 


3383 


5169


6955


784CIP2C_189


co on 
3Q7A 


1598 


3384 


5170


6956


784CIP2C_190


cncn 


1599 


3385 


5171


6957


784CIP2C_191


cnci 
0 UO JL 


1600


3386 


5172 


6958 


784CIP2C 192 


6109 


1601 


3387 


5173 


6959 


784CIP2C_193 


6160 


1602 


3388 


5174 


6960 


784CIP2C_194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195 


6398 


1604 


3390 


5176 


6962 


784CIP2C_196 


6398


1605 


3391 


5177 


6963 


784CIP2C_197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198 


6448 


1607 


3393 


5179 


6965 


784CIP2C_199 


6469 


1608 


3394 


5180 


6966 


784CIP2C_200 


6476


1609 


3395


5181 


6967 


784CIP2C_201 


6561


1610 


3396


5182 


6968 


784CIP2C_202


0 O / fx 


1611 


3397 


5183 


6969 


784CIP2C_203


6578 


1612 


3398


5184


6970 


784CIP2C_204


erf') 


1613 


3399 


5185 


6971 


784CIP2C_205


fiK77 


1614 


3400 


5186 


6972 


784CIP2C_206




1615 


3401 


5187 


6973 


784CIP2C_207


CCQC 
O O 


1616 


3402 


5188


6974


784CIP2C_208


K7AC 


1617 


3403 


5189 


6975 


784CIP2C_209


K ROfl 
O DJO 


1618 


3404 


5190 


6976 


784CIP2C_210


o ? J 0 


1619 


3405 


5191 


6977 


784CIP2C_211


CQ41 


1620 


3406 


5192 


6978 


784CIP2C_212


7i i n 


1621 


3407 


5193


6979


784CIP2C_213


7 o nn 


1622


3408


5194


6980


784CIP2C_214


7212 


1623


3409


5195


6981


784CIP2C_215


7218 


1624 


3410


5196


6982


784CIP2C_216




1625 


3411 


5197


6983


784CIP2C_217


7500 


1626 


3412 


5198 


6984


784CIP2C_218


7509 


1627 


3413 


5199 


6985


784CIP2C_219


7523 


1628


3414


5200


6986


784CIP2C_220


7544 


1629


3415


5201


6987


784CIP2C_221


7564 


1630 


3416


5202


6988 


784CIP2C_222 


7568 


1631 


3417


5203


6989


784CIP2C_223


7631 


1632


3418


5204


6990


784CIP2C_224


7813 


1633 


3419


5205


6991


784CIP2C_225


7831 


1634 


3420


5206


6992


784CIP2C_226


7843


1635


3421


5207


6993


784CIP2C_227


7907 


1636


3422


5208


6994


784CIP2C_228


7943 


1637 


3423


5209


6995


784CIP2C_229


8175


1638 


3424 


5210


6996


784CIP2C_230


8216 


1639 


3425 


5211 


6997


784CIP2C_231


8225 


1640 


3426 


5212 


6998 


784CIP2C_232


8271 


1641 


3427 


5213 


6999 


784CIP2C_233


8397 


1642 


3428 


5214 


7000 


784CIP2C_234


O A cc 
ofloo 


1643 


3429 


5215 


7001


784CIP2C_235


8503 


1644 


3430


5216


7002


784CIP2C_236


8953


1645 


3431


5217


7003


784CIP2C_237




1646 


3432 


5218


7004


784CIP2C_238


9139


1647 


3433 


5219


7005


784CIP2C_239


9555 


1648


3434


5220


7006


784CIP2C_240


9650


1649 


3435 


5221 


7007 


784CIP2C_241


QQQQ 

yob y 


1650


3436


5222


7008


784CIP2C_242


9933


1651 


3437


5223


7009


784CIP2C_243


9953 


1652 


3438


5224


7010


784CIP2C_244


9981 


1653 


3439 


5225 


7011


784CIP2D_1


/flo 


1654 


3440 


5226 


7012 


784CIP2D_2


Jbbo 


1655 


3441 


5227 


7013


784CIP2D_3


3558


1656 


3442 


5228


7014


784CIP2D_4


t £ 1 1 


1657 


3443 


5229


7015


784CIP2D_5


jbbo


1658 


3444 


5230


7016


784CIP2D_6


J /32


1659 


3445 


5231 


7017


784CIP2D_7


4004


1660 


3446 


5232


7018


784CIP2D_8


4700


1661 


3447


5233


7019


784CIP2D_9


4703


1662


3448


5234


7020 


784CIP2D 10 


4774 


1663


3449 


5235 


7021 


784CIP2D 11 


4894 


1664


3450


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D 13 


5159 


1666 


3452 


5238 


7024 


784CIP2D_14


7443 


1667 


3453 


5239 


7025 


784CIP2D_15 


8673 


1668 


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027 


784CIP2D_17


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756


1672 


3458 


5244 


7030 


784CIP2D_20 


8818 


1673 


3459 


5245 


7031 


784CIP2D_21 


8644 


1674 


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247


7033 


784CIP2D_23 


8912 


1676 


3462 


5248 


7034 


784CIP2D 24 


8918 


1677 


3463 


5249


7035


784CIP2D_25


8918 


1678 


3464 


5250


7036 


784CIP2D 26 


8941 


1679 


3465 


5251 


7037 


784CIP2D_27 


8941 


1680


3466 


5252 


7038 


784CIP2D_28 


8951 


1681 


3467 


5253 


7039 


784CIP2D_29 


8951 


1682 


3468 


5254 


7040 


784CIP2D 30 


9007 


1683 


3469 


5255 


7041 


784CIP2D 31 


9012 


1684 


3470 


5256 


7042 


784CIP2D_32


9013 


1685 


3471 


5257 


7043 


784CIP2D_33 


9025 


1686 


3472 


5258 


7044 


784CIP2D 34 


9053 


1687 


3473 


5259 


7045 


784CIP2D 35 


9054 


1688 


3474 


5260


7046 


784CIP2D_36 


9054 


1689


3475 


5261 


7047 


784CIP2D_37 


9113 


1690 


3476 


5262 


7048 


784CIP2D_38 


9134 


1691 


3477 


5263 


7049 


784CIP2D 39 


9152 


1692 


3478 


5264 


7050 


784CIP2D 40 


9152 


1693 


3479 


5265 


7051 


784CIP2D_41 


9211 


1694 


3480 


5266 


7052 


784CIP2D_42 


9223 


1695 


3481 


5267 


7053 


784CIP2D_43 


9223 


1696 


3482


5268 


7054 


784CIP2D_44 


9231 


1697 


3483 


5269 


7055 


784CIP2D_45 


9236 


1698 


3484 


5270 


7056 


784CIP2D_46 


9236 


1699 


3485 


5271 


7057 


784CIP2D_47 


9303


1700 


3486 


5272 


7058 


784CIP2D_48 


9309 


1701 


3487 


5273 


7059 


784CIP2D_49 


9314 


1702 


3488 


5274 


7060 


784CIP2D 50 


9326 


1703 


3489


5275 


7061 


784CIP2D_51 


9339 


1704 


3490 


5276 


7062 


784CIP2D_52 


9346 


1705 


3491 


5277 


7063 


784CIP2D_53 


9376 


1706 


3492 


5278 


7064 


784CIP2D_54 


9382 


1707 


3493 


5279 


7065 


784CIP2D_55


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495


5281


7067 


784CIP2D 57 


9439 


1710 


3496 


5282 


7068 


784CIP2D 58 


9485 


1711 


3497 


5283 


7069 


784CIP2D_59 


9493 


1712 


3498 


5284 


7070 


784CIP2D_60 


9501 


1713 


3499 


5285 


7071 


784CIP2D_61 


9526


1714 


3500 


5286 


7072 


784CIP2D_62 


9526 


1715 


3501 


5287 


7073 


784CIP2D_63 


9551 


1716 


3502 


5288 


7074 


784CIP2D_64 


9557 


1717 


3503 


5289 


7075 


784CIP2D_65 


9568 


1718 


3504 


5290 


7076 


784CIP2D 66 


9588 


1719 


3505 


5291 


7077 


784CIP2D_67 


9597 


1720 


3506 


5292 


7078 


784CIP2D_68 


9615 


1721 


3507 


5293 


7079 


784CIP2D_69


9626


1722 


3508 


5294 


7080 


784CIP2D_70 


9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652 


1724 


3510 


5296 


7082 


784CIP2D_72 


9660 


1725 


3511


5297


7083


784CIP2D_73


9662 


1726 


3512 


5298


7084 


784CIP2D 74 


9725 


1727 


3513 


5299 


7085 


784CIP2D_75


9746 


1728 


3514 


5300 


7086 


784CIP2D_76 


9777 


1729 


3515 


5301 


7087 


784CIP2D_77 


9787 


1730 


3516 


5302 


7088 


784CIP2D_78 


9790 


1731 


3517


5303 


7089 


784CIP2D_79 


9842 


1732 


3518 


5304 


7090 


784CIP2D 80 


9842 


1733 


3519 


5305 


7091 


784CIP2D 81 


9848


1734 


3520 


5306 


7092 


784CIP2D 82 


9867 


1735 


3521 


5307 


7093 


784CIP2D 83 


10010


1736 


3522 


5308 


7094 


784CIP2D_84 


10011 


1737 


3523 


5309 


7095 


784CIP2D 85 


10052 


1738 


3524 


5310 


7096 


784CIP2D_86 


10057 


1739 


3525 


5311 


7097 


784CIP2D 87 


10085 


1740 


3526 


5312 


7098 


784CIP2D 89 


10139 


1741 


3527 


5313 


7099


784CIP2D_90 


10142 


1742 


3528 


5314 


7100 


784CIP2D 92 


10165 


1743 


3529 


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D 95 


10273 


1746 


3532 


5318 


7104 


784CIP2E 1 


3121 


1747 


3533 


5319 


7105 


784CIP2E_2


3628 


1748 


3534 


5320 


7106 


784CIP2E_4 


3673 


1749 


3535 


5321 


7107 


784CIP2E 5 


4018 


1750 


3536 


5322 


7108


784CIP2E_6


4467 


1751 


3537 


5323 


7109 


784CIP2E 7 


4865 


1752 


3538 


5324 


7110 


784CIP2E 8 


4916 


1753


3539 


5325 


7111 


784CIP2E 9 


4923 


1754 


3540 


5326 


7112 


784CIP2E 10 


4926 


1755 


3541 


5327 


7113 


784CIP2E_11 


4962 


1756 


3542 


5328 


7114 


784CIP2E_12


4963 


1757 


3543 


5329 


7115 


784CIP2E_13


4964 


1758 


3544 


5330 


7116 


784CIP2E 14 


4988 


1759 


3545 


5331 


7117 


784CIP2E_15


5835 


1760


3546 


5332 


7118 


784CIP2E_16


7682 


1761 


3547 


5333 


7119 


784CIP2E_17 


7682 


1762 


3548 


5334 


7120 


784CIP2E 18 


7699 


1763 


3549 


5335 


7121 


784CIP2E 19 


7707 


1764 


3550 


5336 


7122 


784CIP2E 20 


7707 


1765 


3551 


5337 


7123 


784CIP2E 21 


7752 


1766 


3552 


5338 


7124 


784CIP2E 22 


8357 


1767 


3553 


5339 


7125 


784CIP2E_23


9065 


1768 


3554 


5340 


7126 


784CIP2E 24 


9324 


1769 


3555 


5341 


7127 


784CIP2F 1 


2976 


1770 


3556 


5342 


7128 


784CIP2F_2 


3559 


1771 


3557 


5343 


7129 


784CIP2F 3 


4021 


1772 


3558 


5344 


7130 


784CIP2F 4 


4474 


1773 


3559 


5345 


7131 


784CIP2F 5 


4566 


1774 


3560 


5346 


7132 


784CIP2F_6 


4705 


1775 


3561 


5347 


7133 


784CIP2F 7 


4707 


1776 


3562 


5348


7134 


784CIP2F 8 


4712 


1777 


3563 


5349 


7135 


784CIP2F 9 


5008 


1778 


3564 


5350 


7136 


784CIP2F 10 


5009 


1779 


3565 


5351 


7137


784CIP2F 11 


5015 


1780 


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F_13 


7724 


1782 


3568 


5354 


7140 


784CIP2F_14 


7725 


1783 


3569 


5355 


7141 


784CIP2F 15 


8828 


1784 


3570 


5356 


7142 


784CIP2F 16 


8830 


1785 


3571 


5357 


7143 


784CIP2F 17 


9739 


1786


3572 


5358 


7144 


784CIP2F 18 


9896 
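

In the legible rows of the table above, the four SEQ ID NO columns appear to differ from one another by a constant step of 1786 (for example 1601, 3387, 5173 and 6959 share a row). The sketch below illustrates that relationship only. The constant is inferred from the rows and is not stated in the text, and the function name related_seq_ids is hypothetical.

```python
# Step between adjacent SEQ ID NO columns. This value is an assumption
# inferred from the legible rows of the table; the text does not state it.
SEQ_ID_OFFSET = 1786


def related_seq_ids(full_length_nucleotide_id: int) -> dict:
    """Return the four SEQ ID NO values expected to share one table row."""
    if full_length_nucleotide_id < 1:
        raise ValueError("SEQ ID NO values start at 1")
    n = full_length_nucleotide_id
    return {
        "full_length_nucleotide": n,
        "full_length_peptide": n + SEQ_ID_OFFSET,
        "contig_nucleotide": n + 2 * SEQ_ID_OFFSET,
        "contig_peptide": n + 3 * SEQ_ID_OFFSET,
    }


row = related_seq_ids(1601)
print(row)
```

For 1601 this gives 3387, 5173 and 6959, which matches the row carrying priority docket entry 784CIP2C_193. The priority docket and U.S.S.N. 09/488,725 columns do not follow a fixed step and have to be read from the table itself.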



TABLE 7



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359


337 


1131 


AHLSARLSALILDEVAILPAPQNLSVLSTNMKHLLMWSPVIAPG
ETVYYSVEYQGEYESLYTSHIWIPSSWCSLTEGPECDVTDDITA 
TVPYNLRVRATLGSG/TS/CLEHP/VSIPLIETQPSLPDL/RMEI 
TKDGFHLVIELEDI^PQFEFLVAYWRREPGAEEHVWviVRSGGlP 
VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPL
VLALFAFVGFMLILVWPLFVWKMGRLLQ/YLLLPRGGSSQTPW
KITQF 


5360 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLK
CVASGHPRPDITWMKDDQALTRPEAAEPRKXKWTLSLKNLRPED 
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 
WLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYS
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGSISSANILLDDQFQPKLTDFAMAHFRSHLEHQSCTINMTSS 
SSKHLWYMPEEYIRQGKLSIKTDVYSFGIVIMEVLTGCRWIjDD 
PKHIQLRDLLREIMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGRCAATRAKLRPSMDEVLNTLESTQASLYFAEDPPTSLKSFRC
PSPLFLENVPSIPVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362 


2 


4879 


SCQVEGCTRTYNSSQSIGKHMKTAHPDQYAAFKMQRKSKKGQKA 
NNLNTPNNGKFVYFLPSPVNS SNPFFTS QTKANGNPACSAQLQH 
VSP P I F P AHLAS VS T PLLS S M E S V I NPN I TSQDKNEQGGMLCS Q 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPS PADSGTNSVFSQLENNTNHYSSQ IEGNTNS SFLKGGNGENA 
VFPSQVNVANNFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKWPAIIRDGKFICSRCYRAFTNPRSLGGHLSKRSYCKPLDGA 
EIAQELLQSNGQPSLLASMILSTNAVNLQQPQQSTFNPEACFKD 
PSFLQLLAENRS PAFLPNTFPRSGVTNFNTSVSQEGSEI I IQAL 
ETAG IPS T FEGAEM LS HVS TG CVS D ASQ VNATVM PNP TVP P LLH 
TVCHPNTLLTNQNRTSNS KTSS IEECSSLPVFPTNDLLLKTVEN 
GLCSS S FPNSGG PSQNFTSNS SRVS VISGPQNTRSSHLNKKGNS 
AS KRRKKVAPPL I APNASQNLVTSDLTTMGLIAKSVB I PTTNLH 
SNVIPTCEPQSLVENLTQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDS QMMALNS CTTSVNS DLQI S EDNVI QNFE KT 
LEIIKTAMNSQILEVKSGSQGAGETSQNAQINYNIQLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKBDQIQEILEGL 
QKLKLENDIiSTPASQCVLINTSVTLTPTPVKSTADITVIQPVSE 
M INIQFNDKVNKPF VCQNQGCNY SAMTKDAL FKHYGK IHQYTPE 
M I LEI KKNQLKFAP F KC WPTCT KTFTRNS NLRAHCQLVHH FTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHS NVAVI PEKQL IEKKS PD KTES S LQ VI TVTS 
EQCNTNALTNTQTKGRKIRRHKKEKEEKKRKKPVSQSLEFPTRY 
S P YRP YRCVHQGCFAAFTI QQNL I LH YQAVHKS DL P AFS AE VEE 
ESEAGKESEETETKQTLKEFRCQVSDCSRIFQAITGLIQHYMKL 
HEMTPEEIESMTASVDVGKFPCDQI^CKSSFTTYLNYVVHLEAD 
HGIGLRASKTEEDGVYKCDCEGCDRIYATRSNLLRHIFNKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALSECTSRFVTQYPCMIKGCTSWTSESNIIRHYKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTAT VS Q KE VE KNE * DEMDE LTEL F ITKL I NEDS TS VETQA 
NTSSNVSNDFQEDNLCQSERQKASNLKRVNKEKNVSQMiCKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEEHPASFDWSSFXPMGFE 
VS FLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL
VLKQLQEMKPTVSLKKLEVHSNDPDMSVMKDISIGKATGRGQY 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PP S WRRQ P P GG IRRDFS RRLRRE ANL VATCLPVRAS LPHRLNML 
RG PG PGLLLLAVLCLGTAVPS TG AS KS KRQAQQMVQPQS P VAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR 
I TCTS RNRCNDQDTRTS YRIGDTWS KKDNRGNLLQ C I CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSG WYS VGMQLA* KTQGNKQML \CTCLGNGVSCQETAVTQTYG 
GNSNGEP CVLP FT YNGRT FYS CTTEGRQDGHLW CSTTS N YEQDQ 
KYS FCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTS EGRR 
DNMKW CGTTQNYDADQKFGFC PMAAHEE I CTTNEGVMYR I GDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESWEITASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
I T I YAVE ENQ ESTP W I QQETTGTP RSDTVPS PRDLQ F VE VTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 
TGLS PGVT Y Y FKVFAVSHGRES KPLTAQQTTKL\DAPTNLQFVN 
ET D S TVL VRW T P PRAQ I TG YRLT VGLTRRGQ PRQ YNVG PS VS K Y 
PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVE YVYT I QVLRDGQERDAP \ I VNK\ WTPLS PPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 
HADQS SCTF \ DNLEVPGLE YNVS VYTVKDDKES VP I SDTI I PAV 
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNS3TLTNLTPGTEYW 
S I VALNGREES PLLIGQQSTVSDVPRDLEWAATPTSLLI \ SWD 
APAVTVRY YR ITYGETGGNSPVQEFTVPGS KSTATI SGLKPGVD 
YTITVYAVTGRGDSPAS SKPIS INYRTEIDKPSQMQVTDVQDNS 
ISVKWLPSSSPVTCYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQ PTVE YWS VYAQNPSGESQ PL VQTAVTNTI DRFKGLAFTD V 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGIjRPGS e ytvs walhddme S Q PL IGTQSTAI paptd LKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVVVSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
D VRS YTI TG LQPGTD YKI YLYTLNDNARS S P W I DAS TA I DAPS 
NLRFLATT PNS LLVS WQ P PRAR I TG Y IIKYEKPGSP PRE WPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI I VEALKDQQRHKVREEWTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGG EGTPGASGKRGP AATTS LVLC I PS VP PP VP FPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCy(3GSRGFNCESK 
PEAEETCFDKYTC3NTYRVGDTYERPKDSMI WDCTCIGAGRGR IS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
C KP I AEKC FDHAAGTS YWGETWE KP YQGWMMVDCT CLGEGSGR 
I TCTSRNRCNDQDTRTS YRIGDTWSKKDNRGNLLQC I CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQML\ CTCLGNG VS CQETAVTQTYG 
GNSNG EP CVIi P FTYNGRT FYS CTTEGRQDGHLWCS TTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQLRDQC I VDD I TYKVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GD S WE KYVHGVRYQC YC YGRG IGE WHCQPLQTYP S S SG P VE VF I 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYEI>SEEGDEPQYLV1jPSTATSV\NIP\DIjLPGRKYIYN 

vyqisedgbqslilstsqttapdappdptvdqvddtsiwrwsr 
pqapitgyrivyspsvegsstelnlpetansvtlsdlqpgvqyn 
itiyaveenqestpwiqqettgtprsdtvpsprdlqfvevtdv 
kvtimwtppesavtgyrvdvtpwlpgehgqrlplsrntfVaen 

TGLS PGVTYYFKVFAVSHGRESKPLTAQQTTKIi \DAPTNLQFVN 

etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrnlqpaseytvslvaikgnqespkatgvfttlopgssippyn 
tevtettivitwtpaprigfklgvrpsqggeaprevtsdsgsiv 
vsgltpgve yvytiqvlrdgqerdap \ i vnk \ wtpls pptnlh 
leanpdtgvltvswersttpditgyritttptngqqgnsleew 

HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTI I PAV 
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\WSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
S1VALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVTVRYYR I T YGETGGNS PVQEFTVPGSKS TATISGLKPGVD 
YT I TVYAVTGRGDS PAS S KP I S INYRTE I DKPS QMQVTDVQDNS 
ISVKWLPSSS PVTG YRVTTT\ PKNGPG\ ptktktagpdqtemti 
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVS WALHDDMESQPL IGTQSTAI PAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETT IT I S WRTKTET I TGFQVDAVPANGQTP I QRTI KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNSIiLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQP SVGQQM I FE EHGFRRTT P PTTAT P I RHRPRP Y P PNVGQE 
ALSQTTISWAPFQDTSEYI I SCHPVGTDEEPLQFRVPGTSTSAT 
LTGL TRGATYN 1 1 VEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
S S R W CHDNGVNY K I GEKW DRQGENGQMMS CTCLGNG KG EFKCD P 
HE AT C YDDGKT YH VGEQWQ KEYLGA I CS CTC FGGQRGWRCDNCR 
RPGGE PS PEGTTGQS YNQYSQRYHQRTKTNVNCP I EC FMPLDVQ 
ADREDSRE 


5365


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKS KRQAQQMVQPQS PVAVS 
QS KPGCYDNGKHYQ INQQWERTYLGNALVCTCYGGSRGFNCES K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGWGKGEWT 
CKPIAEKC FDHAAGT S YWGETWE KPYQGWMMVDCTCLGEGSGR
I TCTSRNRCNDQDTRTS YR IGDTWS KKDNRGNLLQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA* KTQGNKQML\ CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLP FTYNGRTF YS CTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVL VQTRGGNSNGALCHFPFL YNNHNYTD CTS EGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNW 
DT FHKRHE EGHMLNCTC FGQGRGRW KCD PVDQ CQDSE TGTF YQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTI KGLKPGWYEGQLIS IQQYGKQEVTRFDFTTTSTST 
PVTSNT\ VTGETTPFS PLVATS ES VTE ITAS S F WSWVS ASDTV 
SGFR VE YELSEEGDEPQ YLVLPS TATS V\NI P \ DLLPGRKYI VN 
VYQI SEDGEQSL I LSTSQTTAPDAP PDPTVDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTI MWTP PE S AVTG YRVD VI P VNL P GEHGQRLPLSRNTF \ AEN 
TGLS PG VTY Y FKVFAVSHGRES KPLTAQQTTKL \ DAPTNLQ FVN 
ETDSTVLVRWTP PRAQITGYRLTVGLTRRGQPRQ YNVGP S VS KY 
PLRNLQPASEYTVSLVAIKGNQES PKATG VFTTLQ PGS S I P P YN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP \ I VNK\ WTPLSPPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 
HADQSSCTF\DNIjEVPGLEYNVSVYTVKDDKESVPISDTIIPAV 
PPPTDLRFTN/ 1 LGPDTMRVTW\APP PS IDLTNFLVRYS PVKNE 
GRMLQS LS I F FLS DN\AWLTNLL PG TE YWS VS S V YEQHE STP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNIiTPGTEYW 
SIVALNGREESPIiLIGQQSTVSDVPRDLEWAATPTSLLl\SWD 
APAVTVRYYRIT YGETGGNS P VQEFTVPGS KSTAT ISGLKPGVD 
YTITVYAVTGRGDS PASSKP ISINYRTEIDKPSQMQVTDVQDNS 
IS VKWLPSSS PVTG YRVTTT \ P KNG PG \ PTKTKTAG PDQTEMT I 
EGLQ PT VE YWS VYAQNPSGES QPLVQTAVTN I DRPKGLAFTD V 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDIiKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETT I T I S WRTKTET I TGFQVDAVPANGQTP IQRT I KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNS LLVS WQP PRARI TGY 1 1 KYEKPGS PPRE WPRP 
RPGVTEATI TGLE PGTE YTI YVIALKNNQKSEPLIGRKKTDELP 
QLVTL PHPNLHGPE ILDVPSTVQKTPFVTHPGYDTGNG IQLPGT 
SGQQ PS VGQQM I FEEHGFRRTTPPTTATP I RHRPRP YP PNVGQE 
AXiSQTT I S WAP FQDTS E Y 1 1 S CHP VGTDEE PLQFRVPGTSTS AT 
LTGLTRGAT YN 1 1 VE ALKDQQRHKVREE WTVGNS VNEGLNQP T 
DDS CFDP YTVSH YAVGDEWBRMS E SGFKLLCQCLGFGS GHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
R PGGE P S P EGTTG QS YNQYS QR YHQRTNTNVNCP I E CFM PLDVQ 
AD REDS RE 


5366 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
Q S KPG C YDNGKHYQ I NQQWERT YLGNALVCTC YGGS RG FNCES K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CT I ANRCHEGG QS YK I GDTWRR PHETGG YMLE CV CLGNG KGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCIjGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNIiLQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW
DKQHDMGHMMRCTCVGWGRGEWTCIAYSQLRDQCIVDDITYNVW
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRWTF\AEN
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY
PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV
VSGLTPGVEYVYTIOVLRDGQERDAP\IVNK\WTPLSPPTNLH
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW
HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS
SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT
LTGLTRGATYNIIVEALKDQQRHKVREEWTVGNSVNEGLNQPT
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ
ADREDSRE


5367 


235 


3591 


KK1LNMLCKKNIVIEYLADILYEYLYGFCFSGIKKYLIIHVLRL 
ILELWMTRLLLEKSVSLQTQYLLLIVKILSWFPGKEMRHHLQIM 
EVMMRKQDS /RI VGNGS EQQLQKELADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPEMSLPVKPGQ 
GDSEAS S P FTP VADEDS WFS KLT YLGCAS VNAPRS EVEALRMM 
S I LRSQCQ I SLDVTLSVPNVS EG I VRLLDPQTNTE I ANYP I YKI 
L FCVRGHDGT P ES DCFAFTES H YNAEL FR I HVFRCE IQE AVSR I 
LYSFATAFRRSAKQTPLSATAAPQTPDSDI FTFSVSLE I KEDDG 
KG YFSAVPKDKDRQCFKLRQG IDKKI VI YVQQTTNKELAI ERCF 
GLLLSPGKDVRNSDMHLLDLESMGKSSDGKS YVITGS WNP KS PH 
FQ WNEE TP KD KVLFMTTAVDL V I TE VQE P VRFLLETKVR VCS P 
NERLFWPFS KRSTTENFFLKLKQ I KQRER KNNTDTL YE WCLES 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEEEDNDEPLL 
S GSGDVS KE CAEKI LET WGELLS KWHLNLNVRPKQLSS L VRNG V 
P EALRGE VWQLLAG CHNNDHLVEKYR I L I TKES PQDS AI TRDI N
RTFPAHDYFKDTGGDGQDSLYKICKAYSVYDEEIGYCQGQSFLA 
AVLLLHMPEEQAFSVLVKIMFDYGLRBLFKQNFEDLHCKFYQLE 
RLMQEYIPDLYNHFLDISLEAHMYASQWFLTLFTAKFPLYMVFH 
I IDLLLCEGISVTFNVALGLLKTS KDDLLLTDFEGALKFFRVQL 
PKRYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDP I ERFERENRRLQEANMRLEQENDDLAHELVTS KI ALRKDLD 
NAEEKADALNKELLMTKQKLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRWGISSTKEVLDEDTDEEKETLKNQL 
REMELELAQTKL\QLVEAECKIQD\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLSS IKTATGVQGKETC 


5368 


573 


2014 


GAAAGAADPRRGS LGGRTMLDFAI FAVTFLLALVGAVLYLYPAS 
RQAAGIPGITPTEEKDGNLPDIVNSGSLHEFLVNLHERYGPWS 
FWFGRRLWSLGTVDVLKQHINPNKTLD/LF*NHAEVIIKVSIW 
WWQCE*KP\QRKKLYENGVTDSLKSNFALLLKLPEBLLDKWLSY 
PETQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSE IGKG FLDGSLDKNMTRKKQYEDALMQLES VLRNI I KERK 
GRNFSQHI FIDS LVQGNLNDQQ I LEDSM I FS LAS CI I TAKLCTW 
AIWFLTTSEEVQKFCLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTP VSAQLQDI EGK1DRF 1 1 PRETLVL YALGWLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGFSGTQECPELRFAYMVT 
TVLLS VLVKRLHLLSVEGQVIETKYELVTS SREEAWITVS KRY 


5369 


1 


6622 


PRS L C F S LWAE AAVIiADGGLRRRRRliLRGTMS AS F VPNG AS LED 
CKCNLFCLADLTG I KWKKYVWQGPTSAP ILFPVTEED P ILS SFS 
RCLKADVLG/ VWRRDQRPERRE\L * I FWGGEDP\ VLLTLFTMTY 
QKKKMECGRMDF PMNAVLCFSKAVHNLLERCLMNRNFVRI GKWF 
VKP YEKDEKP INKSEHLS CS FTFFLHGDSNVCTS VE INQHQPVY 
LLS EEHI TLAQQSNSP FQVI LCP FGLNGTLTGQAFKMSDSATKK 
LIGEWKQFYPISCCLKEMSEEKQEDMDWEDDSLAAVEVLVAGVR 
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 
MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 
GGKI PRKLANHVVDRVWQECNMNRAQNKRKYSASSGGLCEEATA 
AKVASWDFVEATQRTNCSCLRHKNLKSRNAGQQGQAPSLGQQQQ 
ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 
SQRLV\ I SAP \ DS Q \ VR F SN I R\ TNDVAK\ TPQMHGTEMANS PQ 
PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 
EDRIDSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVS DELVQQYQI KNQCLS AIASDAEQEPKIDP YAFVEGDEEF 
LFPDKKDRQNSEREAGKKHKVEDGTSSVTVLSHEEDAMSIiFSPS 
IKQDAPRPTSHARPPSTS L I YDSDLAVS YTDLDNLFNSDEDELT 
PGS KRS ANGS DD KAS CKE S KTGNLD PLS CIS TADLHKM Y P T P PS 
LEQH I MG FS PMNMNNKE YGSMDTTPGGTVLEGNS SS IGAQ FKI E 
VDEG FCS P KPS E I KDFS YVYKPENCQ I L VGCS MFAP LKTLPS Q Y 
LPLIKLPEECIYRQSWTVGKLELLSSGPSMPFIKEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGS VKYENSDLYS PASTPS TCRPLNS VEP 
ATVPSIPEAHSLYVNLILSESVMNLFKDCNSDSCCICVCNMNIK 
GADVGVYIPDPTQEAQYRCTCGFSAVMNRKFGNNSGLFFEDELD 
I IGRNTDCGKEABKRFEALRATSAEHVNGGLKES EKLSDDL ILL 
LQDQCTNLFS PFGAADQD PFPKSG VI SNWVRVEERDCCND C YLA 
LEHGRQFMDNMSGGKVDEALVKSSCLHPWSKRNDVSMQCSQDIL 
RMLLSLQPVLQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 
TDESPEPLPIPT FLLG YD YDYLVLS PFAL P YWE RLMLE PYG S QR 
DIAYWLCPENEALLNGAKSFFRDLTAIYESCRLGQHRPVSRLL 
TDG IMRVGS TAS KKLS EKLVAE W FS QAADGNNEAFSKLKL YAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NT PSATLASAASS TMTVTSGVAI S T S VATANSTLTTAS TS S S S S 
SNLNSGVS SNKLP3 FPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
QTSAIiQTAGISGESS S LPTQPHPD VSESTMDRDKVG IPTDGDSH 
AVTYPPAIVVYIIDPFTYENTDESTNSSSVWTLGLLRCFLEMVQ 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, 
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
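
The symbol key above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how a segment from this table might be tallied. The function name `summarize_segment` and the returned dictionary keys are hypothetical; only the meanings of the symbols are taken from the key. Whitespace is skipped on the assumption that it is a layout artifact, and any character outside the key is reported rather than interpreted.

```python
# Illustrative sketch only: the helper name and the result keys are
# assumptions; the symbol meanings come from the table's key.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine", "L": "Leucine",
    "M": "Methionine", "N": "Asparagine", "P": "Proline",
    "Q": "Glutamine", "R": "Arginine", "S": "Serine", "T": "Threonine",
    "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}
MARKS = {
    "*": "stop",                 # stop codon
    "/": "possible_deletion",    # possible nucleotide deletion
    "\\": "possible_insertion",  # possible nucleotide insertion
}


def summarize_segment(segment):
    """Count residues and annotation marks in one amino acid segment.

    Returns (counts, unrecognized), where unrecognized lists every
    non-whitespace character that is not in the key.
    """
    counts = {
        "residues": 0,
        "stop": 0,
        "possible_deletion": 0,
        "possible_insertion": 0,
    }
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue  # assumed to be layout, not sequence content
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKS:
            counts[MARKS[ch]] += 1
        else:
            unrecognized.append(ch)
    return counts, unrecognized
```

Calling `summarize_segment` on a row of the table returns the tallies together with any characters that fall outside the key, which is one way to spot rows that need checking against the source document.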








TL P PH I KS TVS VQ IIP CQ YLLQ P VKH EDRE I YPQHLKS 1»AFS AF 
TQCRRPLPTSTNVKTLTGFGPGLAMETALRSPDRPECIRLYAPP 
FILAPVKDKQTELGETFGEAGQKYNVLFVGYCLSHDQRWILASC 
TDLYGELLETCIINIDVPNRARRKKSSARKFGLQKLWEWCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRM CG I S AAD S PS I LS ACLVAME PQGS FV I M PD S VS TGS VFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
AFNPNNDGADGMGIFDLLDTGDDLDPDI INILPASPTGSPVHSP 
GSHYPHGGDAGKGQSTDRLLSTEPHEEVPNILQQPLALGYFVST 
AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLHS 
KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQLYNFIMNML 


5370 


1226 


716 


RWSRKLELRRAAQATESRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDGADPCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGLGNTPLHLAACTNHVPVITTLLRGGARVDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSLALAESLSLFRACTSLPVG 
GCISWL 


5371 


1331 


167 


IAAMLWKLLLRSQSCRLCSFRKMRSPPKYRPFLACFTYTTDKQS 
S KENTRTVE KL YKCS VD I RKI RR \ * KDGY F * RMKPMLKKLRI / F 
LQELGADETAVASILERCPEAIVCSPTAVNTQRKLWQLVCKNEE 
ELIKLIEQFPESFFTIKDQENQKLNVQFFQELGLKNVVISRIiLT 
AAPNVFHNPVEKNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ 
NPFILLNS PTAI KETLEFLQEQGFTS FE I LQLLSKLKG FLFQLC 
PRS I QNS ISFS KNAFKCTDHDLKQLVLKCPALL YYS VPVLEERM 
QGLLREGIS IAQI RETPMVLELTPQIVQYRIRKLNSSGYR I KDG 
HLANLNGS KKE FEANFGK I QAKKVRPLFNPVAPLNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLILLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR 
RKAWCRQLGEKGPCQRWSTHNLWLLSFLRRWNGSTAITDDTLG 
GTLTITLRNLQPHDAGLYQCQSLHGSEADTLRKVLVEVLADPLD 
HRDAGDLWFPG\DLRASRMPMWSTASPGASWKEKSPSHPLPSFS 
S W P AS FS SRF * QP AP SGLQ PGMDRSQGH I HPVNWTVAMTQG I S S 
KLCQG 


5373 


2814 


346 


VKKTKS I FNSAMQEMEVYVENIRRKFGVFNYS PFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKS TAS PASTKTGQAGSLSGS PKPFS PQLSAP ITTKTD 
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKE PS P KQDWGKTPPS TTVGSHS P PETPVLTRS S AQ 
TS AAGATATTS TS STVTVTAPAPAATGS P VKKQRPLLPKE \ TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAI FYCCWNTS YCD YPCQ\ QAHWPEH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5374 


2814 


346 


VKKTKS I FNSAMQEMEVYVENIRRKFGVFNYS PFRTPYTPNSQY* 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD 
KTSTTGS ILNLNLDRSKAEMDLKE LSES VQQQSTPVPLI S PKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDPSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TS AAGATATTS TS S TVT VTAP AP AATGS P VKKQRPL L P KE \ TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLP IGTASADVAADI AKYTSKL\MDAI KGTM \TE I 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIPYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKS KESGSTLDLSGSRETPSS ILLGSNQGSDHSR \ SNKS S WSSS 
DEKRGS \TRSDHN/ TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1116 


hiflaeeepmlerrcrgplamgpaqprllsgpsqespqtlgkes 
rglrqqgtsva\qsgaqapgrahrcahcrrhfpgwva\lwlhtr 
rcq a/ rgl pl pc pecg rr frhap flalhrq vhaaatpdwg fach 
lcgqs frg wvalvlhiirahsaakagp facp kmardafwrrkaas 
ss i lrrchpsrprgprpf i cgncgrs i lptwdq / lkvahkrvhv 
srrp*ergppakvfwgprprgpptgdtppgpggdavdrpf\qca 
ccgkrfrhk\pwlirshaacrsgerphq/csrecg\krftnkpy 
lts \hrr ithtarqp y p cke cgrr frhkpnlls h s kihkrs egs 
aqaapgpgspqlpagpqesaaeptpavplkpaqepppgappehp 
qdpieappslyscddcgrsfrlbrflrahqrqhtgerpftcaec 
gknfgkkthlvahsrvhsgerpfrlarkcgrrflprasqsggrn 
saepnaprfgpfvcpdcgkafrhkpylaahrpiatpaekpyvcp 
dcrkafsqksnlWshrrihtgerpyacpdcdrsfsqksnlith 
rkshi rdgafccai cgqtfddeerllahqkkhdv 


5376 


4504 


591 


VSTFSLCLWPAGGGGRGRVSNMAQSKRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVEVIGKGHRGTVAYVGATLFATGKWVGVILDEAKG 
KNDGTVQGRKY FTCDEGHG I FVRQS Q I Q VFEDGADTTS P ET P DS 
S ASKVLKREGTDTTAKTS KLRGLKP KKAPTARKTTTRRP KPTRP 
ASTGVAGAS S S LG P SGS AS AGELS S S E PST PAQT PLAAP 1 1 P TP 
VLTSPGAVPPLPSPSKEEEGLRAQVRDLEEKLETLRLKRAEDKA 
KL KEh EKHK I QLEQ VQ E W KS KMQEQQADLQRRL KEARKEAKEAL 
EAKERYMEEMADTADAIEMATLDKEMAEERAESLQQEVEALKER 
VDE LTTDLE I LKAE I E E KGSDGAAS S YQLKQL EE QNARL KDALV 
RMRDLSSSEKQEHVK\LQKLMEKKNQELEWRQQRERLQEELSQ 
AES TIDE LKEQVDAALGAEEMVEMLTDRNLNLEEKVRELRE TVG 
DLEAMNEMNDELQENARETELELREQLDMAGARVREAQKRVEAA 
QETVAD YQQT I KKYRQLTAHIjQDVNRE LTNQQEAS VERQQQ PP P 
ETFDFKI KFAETKAHAKAI EMELRQMEVAQANRHMS LLTAFMPD 
SFLRPGGDHDCVLVLLLMPRLICKAELIRKQAQEKFELSENCSE 
RPGLRGAAGEQLS FAAI GLVY\ S LMPAAGHRYHR Y * CHALSQCR 
LD \ VYKKVGS LY P EMSAHERS LD FL I E LLHKDQLDET VNVE PLT 
KAIKYYQHL YS IHLAEQPEDCTMQLADHI KFTQSALDCMS VE VG 
RLRAFLQGGQRATD IALLLRDLETS CS \DIRQFCKKI RRRMPGT 
DAPGI PAALAFGPQVSDTLLDCRKHLTWWAVLQEVAAAAAQLI 
APIiAENEGLLVAALEEIiAFKAS EQI YGTPSSS P YECLRQS CNI L 
ISTMNK\LVTAMQEGEYDAERPPSKPPP\VELRAAALRAEITDA 
EGI^LKLEDRETVIKELKKSLKlKGEEliSEANVRLTLLEKKLDS 
AAKDADER I E KVQTRLEETQALLRKKE KE FEETMDALQAD I DQ L 
EAEKAEL KQRLNSQ S KRT I EG LRGPPP S G I ATL VSG I AGEEQQR 
GAIPGQAPGSVPGPGLVKDSPLLLQQISAMRLHISQLQHENSIL 
KGAQMK7VSLASLPPIjHVAKLSHEGPGSELPAGALYRKTSQLLET 
LNQLSTHTHWDITRTSPAAKSPSAQLMEQVAQLKSLSDTVEKL 
KDEVLKETVSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQIiHQIiHSRLIS 


5377 


762 


1106 


DVPCKRVLPAEAQEKGQLTLSCGESGEEG\F*YHEVRQAEGES* 
/ WFG PNVRLVHTQLKTKKPSGTLKAKF YLHTGSTKFAARIS CTK 
SS*WPGYDGWWGGQYIFIFRGMRWEEQP 


5378 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD 
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP 
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVTGEMV 
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMIAGHSESGG 
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5379 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD 
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNFDCLEHLAASSGTGSSDFEQLEQILEAIP 
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVTGEMV 
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG 
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


P SRAGGAERGRAAAARS PGGS AAGWECPS VLDEAGACTMS SCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SFIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 

rlprrptveshhvs i tgmqdcvqlnqytlkde igkgs yg wkla 
ynendntyyamkvlskkklirqaafprrppprgtrpapggciqp 

RG P I \ EQVYQE I A\ I LKKLDH PNW \ KLVE VL \ DDPN E DHL YMV 

f\elvnqgpvmevptlkplsedqarfyfqdlikgieylhyqkii 
h\rdikpsnllvgedghikiadfgvsnefkgsdallsntvgtpa 
fmapeslsetrkifsgkaldvwamgvtlycfvfg*cpfmderim 
clhskiksqalefpdqpdiaedlkdlitrmldknpesriwpei 
klhpwvtrhgaeplpsedenctlvevteeevensvkhipslatv 
ilvktmirkrsfgnpfegsrreerslsapgnlltkkptrecesl 
selkt*kisplpacckvt*efphpsgcrpscwqppflhthsqpr 
*pepprtdealcpyetgrtcwapllqvlwwvgtplpfplstswl 
pdlvgapgshfcflniallrynshtm 


5381 


2 


2050 


psraggaergraaaarspggsaagwecpsvldeagactmsscvs 
sqpssnraapqdelggrgssssesqkpcealrglsslsihlgme 
sfiwtecepgcavdlglardrpleadgqevpldtsgsqarphl 
sgrkls lqersqgglaaggsldmngrci cpslp ys pvs s pqssp 
rlprrptveshhvsitgmqdcvqlnqytlkdeigkgsygwkla 
ynendntyyamkvlskkklirqaafprrppprgtrpapggciqp 
rgp I \ EQVYQE I a\ i lkkldhpnw\ klvevl \ dd pnedhlymv 
f\elvnqgpvmevptlkplsedqarfyfqdlikgieylhyqkii 
h\rdikpsnllvgedghikiadfgvsnefkgsdallsntvgtpa 
fkapeslsetrkifsgkaldvwamgvtlycfvfg*cpfmderim 
clhskiksqalefpdqpdiaedlkdlitrmldknpesriwpei 
klhpwvtrhgaeplpsedenctlvevteeevensvkhipslatv 
ilvktmirkrsfgnpfegsrreerslsapgnlltkkptrecesl 
selkt*kisplpacckvt*efphpsgcrpscwqppflhthsqpr 
*pepprtdealcpyetgrtcwapllqvlwwvgtplpfplstswl 
pdlvgapgshfcflniallrynshtm 


5382 


1536 


203 


gargsqqdapalqeaevrgperaqpargrmtkarlfrlwlvlgs 
vfmilliivywdsagaahfylhtsfsrphtgpplptpgpdrdre 
ltadsdvdefldkflsagvkqsdlprketeqppapgsmeesvrg 
ydwsprdarrspdqgrqqaerrsvlrgfcansslafptkerpfd 
dipnselshlivddrhgaiycyvpkvactnwkrvmivlsgsllh 
rgapyrdplri prehvhnasahltfnkfwrrygklsrhlmkvkl 
kkytkflfvrdpfvrlisafrskfeleneef/*pqvrrahaaav 
rqphqparlgarglprwpq\vsfanfiqylldphteklapfneh 
wrqvyrlchpcqidydfvgkletldedaaqllqllqvdlaaplp 
PELPGTGPPSSWEEDWFAKIPLAWRQQLYKLYEADFVLFGiTPKP 
ENLLRD 


5383 


45 


5250 


VERLLGCRNS KRTWRML I S KNMP WRRLQG I S FGMY S AE E L KKLS 
VKS I TNPR YLDSLGNP S ANGL YDLALGPADS KE VC S TC VQDFSN 
CSGHLGH I EIiPL WYNPLLFD KL YLL LRGS CLNCHMLTC PRAVI 
HLLLCQLRVLEVGALQAVYELERI LS RFLEENADPS ASE I REEL 
EQYTTEIVQNNLLGSQGAHVKNVCESKSKLIALFWKAHMNAKRC 
PHCKTGRS WRKEHNSKLTITFPAMVHRTAGQKDSEPLG I EEAQ 
IGKRGYLTPTSAREHLSALWKNEGFFLNYLFSGMDDDGMESRFN 
PSVFFLDFLWPPSRSRPVSRLGDQMFTNGQTVNLQAVMKDWL 
IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSFLSTLPGQ 
SLIDKLYNIWIRLQSHVNIVFDSEMDKLMMDKYPGIRQIIjEKKE 
GLFRKHMMGKRVDYAARSVICPDMYINTNE IG I PMVFATKLTYP 
QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 
QREAVAKQLLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPTLH 
RPSIQAHRARILPEEKVLRLHYANCKAYNADFDGDEMNAHFPQS 
ELGRAEAYVLACTDQQ Y LVPKDGQ PLAGL I QDHMVSGAS MTTRG 
CF FTREHYME LVYRGLTDKVGR VKLLS PS I LKP FP LWTGKQ WS 
TLLINI I PEDHI PLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 
ESQVIIREGELLCGVLDKAHYGSSAYGLVHCCYEIYGGETSGKV 
LTCLARLFTAYLQL YRG FTLGVED I L VKP KADVKRQR 1 1 E E S TH 
CGPQAVRAALNLPEAASYDEVRGKWQDAHLGKDQRDFNMIDLKF 
KEEVNHYSNEINKACMPFGIiHRQFPENTLQLMVQSGAKGSTVNT 
MQ I S CLLGQ I ELEGRS T PLMASGKSLP CFEP YE FT PRAGG F VTG 
RFLTGIKPPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIKHLE 
GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 
SNYE V I M KSQHLHE VLS RAD P KKALHHFRAI KKWQS KH PNT LLR 
RGAFLSYSQKIQEAVKALKLESENRNGR/RPWDS/G/RMLRMWY 
ELDEESRRKYQKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 
YSQEWAAQTEKSYEKSELSLDRLRTLLQL\KWQRSLCEPGEAVG 
LLAAQSIGEPSTQMTLNTFHFAGRGEMNVTLGIPRLREILMVAS 
ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 
fc.bi»Q,MEEK^NKFQVYQLRFQFLPHAYYQQEKCLRPEDILRFMET 
RFFKLLME S I KKKNNKAS AFRNVNTRRATQRDLDNAGELGR S RG 
EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
P S RP PDAAPETHPQPGAPGA\ EAMERRVQAVRE I HPF I DD YQ YD 
TEESLWCQVT VKLPLMK I NFDMS SL WS LAHGAV I YATKG I TRC 

AIANTYG I EAALRVIEKE I KDVFAVYG IAVDPRHLS LVADYMCF 
EGVYKPLNRFG I RS NS S PLQQMTFETS FQ FLKQATMLGSHDE LR 
S PSACL WGKWRGGTGLFELKQPLR 


5384 


196 


886 


UouuU-ivLir l v Jj w l» w UJb'^tjoCrL.XJjbli r \ PGRPHALPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALHSGEDFQTLLFERVFVNLDGCFDriAT 
GQFAAPLRGIYFFSLNVHSWNYKETYVHIMHNQKEAVILYAQPS 
EiKoimysUi vi»UjUii/ii LiJJKV W v KLic JvKyKbWAl xSNDFDTYITF 
SGHLIKAEDD 


5385 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ 
SDGERKAYVRLAPDYDALWATKIGIT 


5386 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ 
SDGERKAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL 
ALFFPEM VWAS LGAAWVADGVQCDRTWNG 1 1 ATWVS W 1 1 1 AA 
TWS 1 1 1 VFDPLGGKMAP YS SAG PSHLDSHDS SQLLNGLKTAAT 
SVWETR IKLLCCCIGKDDHTRVAFSS TAELFS TYFSDTDLVPSD 
I AAGLALLHQQQ DN IRNNQE P AQ VVCHAPG SS QEADLDAEL KNC 
HH YMQ FAAAAYG W P L Y I YRNPLTGLCRI GGDCCRSKNPQTMT /M 
VGGDQLQL / CTS AP I LHTHRAAVQGLH PRQ LP WTR FTE LP FLVA 
LDHRKESVWAVRGTMSLQDVLTDLSAESEVLDVECEVQDRLAH 
KG I SQAARYVYQRL INDG I LSQAFS I APE YRLVI VGHS LGGGAA 
ALLATMVRAAYPQVRCYAFS PPRGLWSKALQEYSQSFI VSLVLG 
KDV I PRLSVTNLE DLKRR I LR WAHCNKPKYK I LIjHGLWYEL FG 
GN PNNL PTE LDGGDQE VLTQ PLLGEQ S LLTRWS PAYSFS S DS PL 
DSSPKYPPLYPPGRIIHLQE EGAS GR FGCCS AAH YS AKW S HEAE 
FSKILIGPKMLTDHMPDILMRALDSWSDRAACVSCPAQGVSSV 
DVA 


5388 


1569 


753 


TADGGAGGGGRRQAGVRRH YLYP FTGG YRRRRAACQAERPAARS 
KDTDLAAYQKGNLG VQLRNMAQETNHSQVPMLCS TGCG F YGN PR 
TNGMCS VCYKEHLQRQNS SNGRIS P P VQCTDGS VPEAQS ALDS T 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQP S EEQS KSLE \NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TVVYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389 


1569 


753 


TADGGAGGGGRRQAGVRRH YLYP FTGG YRRRRAACQAERPAARS 
KDTDLAAYQKGNLG VQ LRNMAQETNHS QVPMLCS TGCG F YGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPS EEQS KSLE \NRNKKRIAVS CAGRKWDLLGLNAGVEMF 
TVVYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5390 


217 


1332 


EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDS ITRDDN I AAF KR I RLRPRYLRDVSEVDTRTT IQGEE I 
SAP I C I APTGFHCLVWPDGEMSTARAAQAA\GI CYI TSTFAS CS 
LED I VIAAPEGLRWFQL YVHPDLQLNKQLIQRVESLGFKALVIT 
LDTPVCGNRRHD IRNQLRRNLTLTDLQS PKKGNAI P YFQMTP I S 
TSLCWNDLSWFQS I TRLP I ILKGILTKEDAELAVKHNVQGIIVS 
NHGGRQLDEVLAS IDALTE WAAVKGKI E VYLDGGVRTGNDVLK 
ALALGAKC I FLGDA I LWALAS KGEHG VKE VLN I LTNE FHTSMA\ 
LTGCRSVAEINRNLVQFSRL 


5391 


1 


1292 


ra<AAGRSRGPPTAGGQRCEEAPGTVMERRI^VRAWVKENRGSF 
Q PPVCNKIiMHQ EQL KVMF VGG PNTRKD YHI EEGE EVF YQLEGDM 
VLRVLEQGKHRDWIRQGE I FLLPARVPHSPQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAPIIQBFFS 
SEQYRTGKPIPDQLLKEPPFPLSTRSIMEPMSLDAWLDSHHREL 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDVWLWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLSWGPS Y \AW\ ERTQGS VALSVT\Q 
DPACKKSPWGEPSCHGLKAATGVPSTLEVPSLPNNSPSPHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
PVLPGGLPPAPLLPIPLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 


I RGSNAQKWGASGS GGAG PQPD PAG PGG VPALAAAVLGACE PR 
CAAPCPLPALSRCRGAGS RGSRGGRGAAG S GDAAAAAEW IRKGS 
FIHJCPAHGWLHPDARVLGPGVSYWRYMGCIEVLRSMRSLDFNT 
R TQVTREAINRLHEAVP<3 VRGS WKKKAPNKALAS VIjGKSNLR FA 
GMSISIHISTDGLSLSVPATRQVIANHHMPSISFASGGDTDMTD 
YVAYVAKDPINQRACHILECCEGL\AQS I ISTVGQAFELRFKQY 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 
GGLVDSRLALTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAP 
PGDGYVQADARGP P DHE EHL YVNTQGLDAP E PEDS P KKDLFDMR 
P FEDALKLHECS VAAG VTAAPL P LEDQ W P S P PTRRAP VAPTE EQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFESISHLIDHHLQNGQPIVAAE 
S ELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN 


5394 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN 


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
S SGN PEAVALAPDAYSTGS S S ASS TL KRTKKPR P PS LKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLE ET PLE P AAG PKAAC PLD S ES VEGW P P ASGGGR VQNS P P VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDI P IAKGTYTFDI DKWDDPNFNPFSSTSKMQES PKL 
PQQ S YNFD PDTCDE SVDPFKTSSKTPSSPSKS PAS FEI PASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN 
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5396 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETP PVKETQQEPDEESLVPSGENLAS ETKTESAKTEGPS PA 
LLEETPLEPAAGPKAACPLDSES VEG WP PASGGGRVQNS P P VG 
RKTLPLTTAPEAGEVTPSDSGGQEDS PAKGHSVRLE FDYS EDKS 
S WDNQQEN P PPT KK IGKKP VAKMPLRRP KMKKT PEKLDNT PAS P 
PRSPAEPNDI P IAKGTYTFD I DKWDDPNFNPFSSTSKMQESPKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGIAPNQESHLQVPEKSSQKELEAMGLGTP 
S E AI E I TAP EG S FAS AD ALLS RLAHP VS LCGALD YL E PDLAEKN 
P P LFAQKLQREAAHP TDVS I S KTALYSR I GTAEVEKPAGLLFQQ 
PDLDS ALQIARAEI ITKEREVSEWKDKYEESRREVMEMRKIVAE 
YE KTIAQMIEDEQREKSVS \HQTVQQLVLE KEQA\ LADLNS VEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 
YQALKVHA\EEKLDRANAE\ IAQVRGKAQQEQAAHQASLAERSS 
CRV\DALE RTLEQKNKE IEELTKI CDELI AKMGKS 


5397 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIE ITAPEGS FAS ADALLS RLAHP VSLCGALD YLEPDLAEKN 
P PLFAQKLQREAAHPTDVS I S KTAL YS R I GTAEVE KPAGLLFQQ 
PDLDSALQ I ARAE 1 1 TKERE VSEWKDKYEESRREVMEMRK I VAE 
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK 
\ S LADL FRRYE KMKEVLEGFRKNEEVLKRCAQE YLSRVKKEEQR 
YQ ALKVHA \ EEKLDRANAE \ I AQ VRG KAQQ EQAAHQASLAERS S 
CR V\ DALE RTLEQKN KEI EELT K I CDEL IAKMGKS 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPS YVFSADP IARPSE INFDGIKLDLS 

HEFSLVAPNTEANSFESKDYLQVCLRIRPFTQSEKELESEGCVH 

I LDSQTWLKE PQ C I LGRLS E KS S G \ QM \AQKFS F FPGFLGPAT 

TQKEFFQGCIMHP\VKDLLKGQSRLIFTYGLTNSGKTYTFQGTE 

ENIRILPRTLNVLFDSLQERLYTKMNLKPHRSRBYLRLSSEQEK 

EEIASKSALLRQIKEVTVHNDSDDTLYGSIiTNSLNISEFEESIK 

D YEQANLNMANS I KFSVWVSFFE I YNEYI YDLFVPVS S KFQKRK 

MLRLSQDVKGYSFIKDLQWIQVSDS KEAYRLLKLG I KHQSVAFT 

KLNNAS SRSHS IFTVKI LQI EDS EMSR V IR VSELS L CD LAGS ER 

TM KTQNEGE RLRETGN INTS LLTLGKG I NVLKNS E KS KFQQH VP 

FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYDETLNVLKFS 

A t AQKVCVPDTLNS S QE KLFG P VKS SQDVS LDSNSNS KI LNVKR 

ATISWEN^LEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 

TLEENKAFISHEEKRKLLDLIEDLKKKLINEKKEKLTLEFKIRE 

EVTQEFTQYWAQREADFKETLLQEREILEENAERRLAIFKDLVG 

KCDTRE EAAKD I CATKVE TE E ATACLELKFNQ I KAELAKTKG EL 

I KTKEE LKKRENES DS L I QELETSNKKI I TQNQR I KEL IN 1 1 DQ 

KEDTINEFQNLKSHMENTFKCNDKADTSSLIINNKLICNETVEV 

PKDSKSKICSERKRVNENELQQDEPPAKKGSIHVSSAITEDQKK 

SEEVRPNIABIEDIRVLQENNEGLRAFLLTIENELKNEKEEKAE 

LNKQIVHFQQELSLSEKKNLTLSKEVQQIQSNYDIAIAELHVQK 

SKNQEQEEKIMKLSNEIETATRSITNNVSQIKLMHTKIDELRTL 

DSVSQISNIDLLNLRDLSNGSBEDNLPNTQLDLLGNDYLVSKQV 

KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 

Q I EKLQAE VKG YKDENNRLKEKEHKNQDDLLKE KETL I QQLKE E 

LQE KNVTLD VQ I QHWEGKRALS E LTQG VTC YKAKI KELET I LE 

TQKVERSHSAKLEQDILEKESIILKLERNLKEFQEHLQDSVKNT 

KDLNVKELKLKEE I TQLTNNLQDMKHLLQLKE E EEETNRQETE K 

LKEELSASSARTQN\LNADLQRKEEDYADLKEKLTDAKKQIKQV 

QKEVSVMRDEDKLLRIKINELEKKKNQCSQELDMKQR\TIQQLK 

EQLINQKVEEAIQQYERACKDLNVKEKIIEDMRMTLEEQEQTQV 

EQDQVL\EAKLEEVERIiATEXiDRWRVKCNI)LETKNNQRSNKEHE 

NNTDVLGKLTNLQDELQES EQKYNADRKKWLE E KMML I TQAKEA 

ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 

ERDQLVAALEIQLKALISSNVQKDNEIEQLKRIISETSKIETQI 

MDI KPKRI SSADPDKLQTEPLSTSFE I SRNKIEDGSWLDSCEV 

STENDQSTRFPKPELEIQFTPLQPNKMAVKHPGCTTPVTVKIPK 

ARKRKSNEMEEDLVKCEWKKNATPRTNLKFPISDDRNSSVKKEQ 

PSILQSKAKKIIETMSSSKLSNVEASKENVSQPKRAKRKLYTSE 
ISSPIDISGQVILMDQKMKESDHQI IKRRLRTKTAK 


5399 


705 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 
ASPTPGEVQRHLQTHGIDGNGELDFSTFLTIMHMQIKQEDPKKE 
ILLAMLMVDKEKKGYVMASDLRSKLTSLGEKLTHKEV\ DDLFRE 
\ADIE PNGKVKYDEFIHKI TS YLDGT Y 


5400 


931 


248 


SHCSSGME I P PTN YPAS RAALVAQN Y I N YQQGT PHRVF E VQKVK 
QASMEDtPGRGHKYRLKFAVEEIIQKQVKVNCTAEVLYPSTGQE 
TAPEVNFTFEGETGKNPDEEDNTFYQRLKSMKEPLEAQNI\PDN 






WO 01/53312 



PCT/US00/34263 



FGNVSPEMTLVLHLAWVACGYIIWQNSTEDTWYKMVKIQTVKQV 
QRNDDFIELDYTILLHNIASQEIIPWQMQVLWHPQYGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSLAPLAPRDFPFPPKLLIHPQAWRLSCGAGSMGS 
yAA/vt. WKJN has Wotibi) SIjouCSMGCF KDDRIVFWTWMFSTYFME 
KWAPRQDDMLFYVRRKLAYSGS E SGADGRKAAEPEVEVEVYRRD 
S KKLPGLGDPDIDWEES VCLNLI LQKLDYMVTCAVCTRADGGDI 
HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\EE\ 
VFSDMTVGKGEMVCVELVASDKTNTFQGVIFQGSIRYEALKKVY 
DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA 
VSRVSTGDTS PCGTEEDSSPAS PMHERVTSFSTPPTPERNNRPA 
FFSPSLKRKVPRNRIAEMKKSHSANDSEEFFREDDGGADLHNAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHLTYVTLPLHRI 
LTD I LEVRQKP I L MT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL 
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHVVQ 
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP 
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT 
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL 
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHVVQ 
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP 
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT 
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNSF 


5404 


187 


1111 


LP VTLI FAKMKTLQSTLLLLLLVPLI KPAPPTQQDS RII YDYGT 
DNFEBS IFSQDYEDKYLDGKNI KEKETVI IPNEKSLQLQKDEAI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT \ AKDFADI PNLRRLDFTGNLIEDI EDGTFSKL 
SLVEELSLAENQLLKLPVLPPKLTLFNAKYNKIKSRGIKANAFK 
IUjJNINJj l B hi IjDnrJAJjUib Vi^bWljriiSIjRVIHLQrwNlASITDDTF 
CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


2199 


1220 


QN SRS LHMDPQNQHGS GSS L W I QQPS LDS R PRLD YERE I QPTA 
I LSLDQ I KAI RGSNE YTEGPS WKRPAPRTAPRQEKHERTHE 1 1 
P INVNNNYEHRHTSHLGHAVLPSNARGP ILSRS TS TGS AASSGS 
NS SASSEQGLLGRS PPTRPVPGHRSERAIRTQPKQLI VDDLKGS 
LKEDLTQHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 
SMVEYGTCMCL\VKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMS L FL PCLLC YP PAKGCLKLCRRC YDW IHRPGCRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


279 


2732 


R WRT YNVEQ PLTFMD VAI E FCLE E WQCLDTAQQNL YRNVMLENY 
RNLVFLG/ I IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDE 
CKVHRGGYNGFNQCLPATQS K I FLFDKCVKAFHKFSNSNRHKIS 
HTEKKLFKCKE CGKS FCMLSHLAQHK I IHTR VNFCKCE KCGKAF 
NCPSI ITKHKRINTGEKPYTCEECGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSSILTTHKIIRTGEKFYKCKECAKAFNQSS 
NLTEH KK I H PGEKP YKCEECGKAFNWP S TLTKHKRIHTGEKP YT 
CEECGKAFNQFSMLTTHKRIHTA\EKFYKCTECGEAFSRS\SNL 
TKHKE I HTE KKP YKCEECG KAF KWS S KLTEHKLTHTGE KP YKCE 
KCGKAFNCPS I ITKHNRINTGEKPYTCEECGKVFNWSSRLTTHK 
KNYTR YKL YKCE E CG KAFNKS S I LTTH KKI H I E KK FYKCE ECGK 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNLTTHKKIHTGEKFYKCEECGKAFTQ 
SSNLTTHKKI HTGGKP YXCEECGKAFNQFS TfcTKHKI IHTEEKP 
YKCEE CGKAFKWS STLTKHKI I HTGEKP YKCEE CG \ KAFKLS ST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECGKAFNYS SHLNTHKR I HTKEQP YKCKECGKAFNQYSNLTTH 
NKIHTGEKL YKPEDVTVILTTPQTFSN I K 


5407 


3 


659 


RPRRRQSSCCTGWLAGWLLRAAPRFCRRTETDMEQGKGLAVLIL 
AIILLQGTLAQS I KGNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCI ELNAATI SGFLFAE IVS I FDLAVGVYF I AGTGMEFR 
QS \ RAS DKQTLL P \NDPAPTQP LKD PRKMTQ YSHLQGN \ QLRRN 


5408 


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSMLGNTCFM 
NSSIQCVSNTQPLTQYFISGRH LYE LNRTNP IGM KGHMAKC YGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELIiAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 
I WDL FHGQLRS Q VKCKTCGH I S VR FD PFN FLS L PLP MDS YMHL 
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQIL 
LAEVHGSNI KNFPQDNQKVRLS VSG FLCAFEI PVPVS PISASS P 
TQTDFSSS PSTNEMFTLTTNGDLPRPI FI PNGMPNTWPCGTEK 
NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR 
PS LFGMPL 1 VPCTVHTRKKDL YDAVW I Q VS RLAS PL P P QEAS NH 
AQDCDDSMGYQYPFTLRVVQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYQTSQERWDEHESVEQSRRAQ 
VE PINLDS CLRAFTS EEELGENEM YYCS KCKTHCLATKKLDLWR 
LPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQ IGS KNKLS S S KENLDASKENGAGQ I CELADALS RGH 
VLGGS QPE LVTPQDHEVALANG FL YEH EACGNGCGNGYSNGQLG 
NHS EE DS TDDQREDTR I KP I YN LYAI S CHSG I LGGGH YVT YAKN 
PNCKW YCYNDSSCKELHPDE IDTDSAY I LFYEQQGI DYAQFL PK 
TDGKKMADTS SMDED FE SD Y \ E KYCVLQ 


5409 


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP
RLPTDLD IGG PWFPHYDFERS CWVRAISQEDQI*ATCWQAEHCGE 
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM 
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEI3LNRVHEKPYVELKDSIX3RPDWEVAAEAWDNHLRRNRS 
IWDLFKGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
E I TV I KLDGTTPVR YGLRLNMDEK YTGLKKQLS DLCGLNS EQI L 
tiAE VHGSNI KNFPQDNQKVRLS VSGFLCAFE I PVPVS P I S ASS P 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK
NFTNGMVNGHMPSLPDS PFTGYI IAVHRKMMRTELYFLSSQKNR 
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH 
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYQTSQERWuEHESVEQSRRAQ 
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 
L P P I L 1 1 HLKR FQ FVNGR W I KSQKI VKFPRE S FD PSAFL VPRDP 
ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSS S KENLDAS KENGAGQI CELADALSRGH 
VLGGS QPELVTPQDHE VALANGFL YEHEACGNG CGNGYSNGQLG 
NHSEEDSTDDQREDTRI KP I YNLYAI S CHSG I LGGGHYVT YAKN 
PNCKW YC YNDSS CKELHPDE I DTDSAY I LFYEQQGI DYAQFLPK 
TDGKKMADTS SMDEDFESDY \EKYCVLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHLYKLLVIGDLGVGKTSIIKRY
VHQNF S SH YRAT I G VD FALKVLHWD P ETWRLQLWD I AG QERFG 
NMTRVY YREAMGAF I VFDVTRP AT FEAVAKWKNDLD S KL S L PNG 
KPVSWLLANKCDQGKDVLMNNGLKMDQFCKEHGFVGWFETSAK 
ENINIDEASRCLVKHILANECDIiMESIEPDWKPHLTSTKVASC 
S G\ CAK I L VGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGS FGKPS PVTGLRAARRRRTRPSAPAAPS VGC 
GKRRESDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQPN 
GGEPFLIGVSGGTASGKSSVCAKIVQLLGQNEVDYRQKQWIIiS 
QDS FYRVLTSEQKAKAL KGQFNFDH PDAFDNEL I LKTL KE I TEG 
KTVQI PVYDFVSHSRKEETVTVYPADVVXjFEGIIiAFYSQER/ IR 
DIiFQMKL FVDTDADTRLSRRVLKD I S E RGRDLEQ I LS S STLR F V 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\DI 
LNGGPS\NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGISNFFHKEANF WFEVSG YL I S PLRS P FVDPALE WS LMAS P WN 
KMEGESS RFE IHTP VSDKKKKKCS IHKER PQKHSHE I FRDS S L V 
NEQSQ I TRRKKRKKDFQHLI S S PLKKSR I CDETANATSTLKKRK 
KRRYS ALEVDEEAGVTWLVD KEN INNT P KHFRKDVD WC VDMS 
I EQKLPRK\ PKTDKFQVLAKSH \ AHKSEALHSKVREKKNKKHQR 
KAAS WES QRA\RDTLPQSE FPTQEES WLS VGPGGE I TELP \AS A 
HKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 
GLDDETP QLLG PTHKKKS KKKKKKKSNHQEFES LAMPEGSQ VG S 
EVGADMQES \RPAVGLHGETAG I PAPAYKNKSKKKKKKSNHQE F 
EAVAM PES LES AYP EG S Q VGS E VGTVEG.S TALKGFKESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSBSTLFDSVEGDGAMMEEG 
VKS RPRQ KKTQACLAS KHVQE APRLE PANE EHNVE TAEDSE I R Y 
LS ADSGDADDSDADLGSAVKQLQEFI PNI KDRATST1 KRMYRDD 
LER FKE FKAQGVAI K FG KFS VKENKQLE KNVEDFLALTG IES AD 
KLL YTDR YPE EKS VI TNLKRR YS FRLHIG \ RNI AR P WKLI YYRA 
KKMFDVNNYKGRYSEGDTEKLKI4YHSLLGNDWKTIGEMVARRSL 
SVALKFSQISSQRNRGAWSKSETRKLIKAVEEVILKKM3PQELK 
EVDSKLQENPESCLS I VREKL YKG ISWVE VEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I YYGMNALRAKVS L I ERL YE INVEDTNE I 
DWEDLASAIGDVPPSYVQTKFSRLKAVYVPFWQKKTFPEIIDYL 
YETTLPLLKEKLEKMMEKKGTKIQTPAAPKQVFPFRDIFYYEDD 
SEGGGHRKRKRRPRRHAWFTPVI PVLWEAKAGWII 


5413 


3753 


1304 


RFPAGVAPRRAMANVS KKVS WSGRDRDDEEAAPLLRRTARPGGG 
TPIjLNGAGPGAARQSPRSALFRVGHMSSVKIiDDELLEP\DMDPP 
HP F PKE I PHNE KLLS LKYESLD YDNS ENQLFLE EERR I NHTAFR 
TVEIKRWVICALIGILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTEKGGLS FSLLLWATLNAAFVLVGSVI VAF I EPVAAGSGI PQ 
IKCFLNGVKIPHWRLKTLVIKVSGVILSWGGLAVGKEGPMIH 
SGSVIAAGISQGRSTSLKRDFKIFEYLRRDTEKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGASFWNQFLTWRIFFASMISTFTLNF 
VLSIYHGNMWDLSSPGLINFGRFDSEKMAYTIHEIPVFIAMGW 
GGVLGAVFNALNYWLTMFR IR YI HRPCLQV I EAVLVAAVTATVA 
FVLIYSSRDCQPLQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SWSLFHDPPGSYNPLTLGLFTLVYFFLACWTYGLTVSAGVFIP
SLLIGAAWGRLFGISLSYLTGAAIWADPGKYALMGAAAQLGGIV 
RMTLS LTVI MMEATSNVT YGFP IML VLMTAK I VGDVF I EGL YDM 
HIQLQSVPFLHWEAPVTSHSLTAREVMSTPVTCLRRREKVGVIV 
DVLSDTASNHNGFPWEHADDTQPARLQGLILRSQLIVLLKHKV 
FVERSNLGLVQRRLRLKDFRDAYPRFPPIQSIHVSQDERECTMD 
LS EFMNPSP YTVPQEAS LPRVFKLFRALGLRHLVWDNRNQWG 
LVTRKDLARYRLGKRGLEELSLAQT 


5414 


2130 


390 


GVASAWDRALFS PLLSPTSRVFRTS PPRCVSTETGRRDRARVPS 
QWCS VLQGKL P VSGRTS LACVRS I L LS PAS S PRKVG I VGGTGAR 
AGAAPRDHGRVRHRRPS SARRMTRTTGQCLAPRGCQGPRGTRS P 
RS PRS RTRRG CS AS P ACLP / CRS AL I VAVLC Y INLLNYMDRFTV 
AG VLPDI EQ FFN I GDS S SGLI QT VF I S S YM VLAP VFG YLGDR YN 
RKYLMCGGIAFWS L VTLGS S Fl PGEHFWLLLLTRGLVGVGEAS Y 
STIAPTLIADLFVADQRSRMLS I FYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGLG WAVLLLFL WRE PPRGAVERHS DL P PL 
NPTSWWADLRALARNPSFVLSSLGFTAVAFVTGSLALWAPAFLL 
RSRWLGETP PCLPGDS CS S SDSL I FGLITCLTGVLGVGLGVE I 
SRRLRHSNPRADPLVCATGLLGSAPFLFLSLACARGS IVATYIF 
IFIGETLLSMNWAIVADILLYWIPTRRSTAEAFQIVLSHLLGD 
AGSPYLIGLISDRLRRNWPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


IPPKTKLELQKH\LTTLT\NQEQATIFEEVQKLRPRNEQRENEL 
IISFLRCLFEBKQKEHIHIGEMKQTSQMAAENIGSELPPSATRF 
RLDMLKNKAKRS LTESLES I LSRGNKARGLQEHS I S VDLDSSLS 
STLSNTSKEPS VCEKEALP ISESS FKLLGSS EDLSSDSESHLPE 
E PAP LS PQQAFRRRANTLSH FP I E CQE PPQPARGS PG VSQRKLM 
RYH S VSTETPHE RKD FES KANHLGDSGGTP VKTRRHS WRQQ I FL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
EKKRTS RELREL WQKAI LQQ I LLLRME KENQKLQ AS ENDLLNKR 
LKLDYEEITPCLKEVTTVWEKMLSTPGRSKIKFDMEKMHSAVGQ 
GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKELLKQLT 
SQQHAILIDLGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQE 
VGYCQGLSFVAGILLLHMSEEEAFKMLKFLMFDMGLRKQYRPDM 
1 1 LQ I QM YQLSRLLHD YHRDLYNHLEEH E IGP S L YAAPWFLTMF 
ASQFPLGFVARVFDMIFLQGTEVIFKVALSLLGSHKPLILQHEN 
LETI VDF I KS TL PNLGLVQMEKTINQ VFEMDIAKQLQA YEVE YH 
VLQEELIDSSPLSDNQRMDKLEKTNSSLRKQjNLDLLEQLQVANG 
RIQSLEATIEKLLSSESKLKQAMLTLELERSALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK
YVDD IQ KGNT I KRLN I QKRRKP S VPC PE PRTTSGQQG I WTSTE S 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSIjPFLTIP 
ENRQLPPPS PQLP ECHNLHVTKTLMETRRRLEQERATMQMTPGE F 
RRPRLAS FGGMGTTS S LPS FVGSGNHNPAKHQLQNG YQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVRTI PVLQVKI SVLQEEKRQLVSQLKNQRAA 
S Q I NVCGVRKRS YS AGNASQLEQLSRARRSGGEL Y I DYEE EEM E 
TVEQSTQRI KE FRQL\TADMQALEQKI QDSS CEAS SELRENGE C 
RS VAVGAEENMND I WYHRGSRS CKDAAVGTLVEMRNCGVSVTE 
AMLG VMTE ADKE I E LQQQT IES LKEKI YRLEVQLRETTHDREMT 
KLKQELQAAGS RKKVDKATMAQPLVFS KWEAWQTRDQMVGS H 
MDL VDTC VG TS VETNS VG I SCQ PECKNKWG PELPMNWW I VKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKE CAS RGVNTEAVS QVE AAV 
MAVPRTADQDTSTDLEQVHQFTNTETATL I ES CTNTCLS TLDKQ 
TS1X3TVETRTVAVGEGRVKDINS S TKTRS IG VGTLLSGHSGFDR 
PSAVKTKES GVGQININDNYLVGLKMRT I ACGPPQLTVGLTAS R 
RS VGVGDDPVGESLENPQ PQAPLGMMTGLDHY I ER I QKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSS1NSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 
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residue of 
amino acid 
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Predicted end 
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corresponding 
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residue of 
amino acid 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QEVGTSEG KP ISS LDAFPTQEGTLS PVNLTDDQIAAGL YACTNN 
ESTLKS IMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSS SD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMR FCLNTLQHEWFRVSSQKSA I PAMVGD YIAAFEA I 
SPDVLRYVXNLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD 
HQNKAGYTP I MLAALAAVEAE KDMR IVEELFGCGD VNAKASQAG 
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG 
HVE IVKLLLAQPGCNGHLEDNDGSTALS I ALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGS FD 


5417 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
YVDD IQKGNT I KRLNI QKRRKP S VPCPE PRTTSGQQG I WTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF 
RRPRLAS FGGMGTTSS LPS FVGSGNHNPAKHQLQNG YQGNGDYG 
SYAPAAPTTSSMGSS I RHSPLSSGI STPVTNVS PMHLQHIREQM 
A IALKRLKELEEQVRTI PVLQVK IS VLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSEIiRENGEC 
RS VAVGAEENMNDI VVYHRGSRS CKDAAVGTL VEMRNCG VS VTE 
AMLGVMTEADKE I ELQQQTI ESLKEKI YRLEVQLRETTHDREMT 
KLKQELQAAG SRKKVD KATMAQ PLVFS KWEAWQTRDQPWGSH 
MDL VDTC VGTS VETNS VG I S CQPECKNKWG PELPMNWW I VKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCS VDVTVCS PKE CASRGVNTEAVSQVEAAV 
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 
TS TQT VETRTVAVGEGRVKD INS S T KTRS IG VGTLLS GHSG FDR 
P S AVKTKESGVGQIN I NDN YLVGLKM RTI ACGP PQLT VGLTASR 
RS VG VGDDPVGES LENPQPQAP LGMMTGLDH YI ERIQKLLAEQQ 
TLLAEKTYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EELRWPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 
QBVGTSEGKP IS SLDAFPTQEGTLS PVNLTDDQIAAGL YACTNN 
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ES S S SESDDE CDV I E YP LE EEEE E EDEDTRGMAEGHHAVNI EGL 
KSAR VEDEMQ VQE CEPE KVE IRER YELS E KMLS ACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI 
S PDVLRYV I NLADGNGNT ALHY S VS HSNFE I VKLLLDAD VCNVD 
HQNKAGYTP I MLAALAAVEAEKDMR I VEEL FGCGD VNAKAS QAG 
QTALMLAVSHGR I DMVKG LLACGADVNI QDDEGSTALMCAS EHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGS FD 


5418 


24 


1133 


S VPRAGGDMETGAAELYDQALLG I LQHVGNVQD FLRVLFGFLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EPPI 
LPRI QEQFQKNPDS YNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQVSVALSSSSIRVAMLEENGERVLMEGKLTHKINTESSLWSL 
EPGKCVLVNLS KVGEYWWNAI LEGE E P IDIDKINKERSMAT VDE 
EEQAVLDRLT FD YHQKLQGKPQSHE L KVHEMLKKGWDAEGS P FR 
GQR FD PAMFN I S PGAVQ F 


5419 


1395 


259 


GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEESPFLGPRAAEEG 

mJQPJXCEHPfiPP VQ1?I?1?<*2dd cn r TQr»i?/ s, OGDViJxr\rKrMVtTTin»r>7vTt7v 
a CtO Eitti* cu\e varn rtivo ■& QJlljKKo U X obfuKo Kl\xl is. V JN W IvriiriiKAlJA 

KDPASLPQC/LGP/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRIQQ WQQS PC IAEEHGKKLLER IRREQQSARTRLQEMERRFH 
E LEAI I LRAKQQAVREDE ESNEGDS DDTDLQI FCVS CGHP INPR 
VALRHMERCYAK YESQTS FGSM YPTR I EGATRLFCDVYNPQS KT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHYCWEKIjPJIAEVlJLERVRVWYICLDELFEQERNVRTAMTN 
RAGLLAIiMLHQT I QHDPLTTDLRSSADR 


5420


117 


1733 


NE AGGACP FKGG ASGRLYLS PRLPRVS VAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHBRIR 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ECUS T LL FATL YI LCH I FLTRF KKPAEFTT \ GMMKM P P S TRL / 
LLELCTFTLAI ALGAVLLLPFS 1 1 SNBVLLSLPRNY Y I QWLNGS 
L IHGLWNLVFLFSNLSLI FLMPFAYFFTES EG FAGSRKG VLGRV 
YBTVVMliMIJjTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRI C^TPTSCWLPLDMELLHRQVLiALQTQRVL 
LEKRRKASAWQRNLGYPIiAMLCLLVLTGLS VL I VAIH I LELLID 
EAAMPRGMQGTSLGQVSFS KLGSFGAVIQWLI F YLMVSS WGF 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAPAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5421 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFL PARP P RAQRHLGFSHAE QSMEAPD YEVLS VREQL FHE R I R 
ECUS TLLFATL Y I LCH I FLTRFKKP AE FTT\GMMKMPPSTRL / 
LLE LCT FTLA I ALG AVLLL P FS 1 1 S NEVLLSLPRNYY I QWLNGS 
L IHGL WNLVFLFSNLS L I FLMPFAYFFTES BG FAGS RKG VLGRV 
YETWMLMLLTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 
YLYSC I SFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVXiALQTQRVL 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 
EAAMPRGMQGTS LGQVS FS KLGSFGAVIQWLI FYLMVSS WGF 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5422 


3 


1263 


SCGESLPTWLAGASRPGIGRKGGAWGGRGGSSPAQVLLSPGPVF 
KAGCNWWH LS RDQAG VQRCDLGS SQPPPLGFKRFS CLSLPS S WD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGWPPGTQVEQLLYAKKLYDSAF 
HPDTG E KMNV I GRM S FQLPGGM 1 1 TG FMLQF YRTMP AV I FWQ WV 
NQS FNAL VNYTNRNAAS PTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP PLVGRW V P FAAVAAANCVNI PMMRQQELI KGICVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERLEKLH 
FMQKVKVL/SAPLQVMLSGCFLIFMVPVACGLFPQKCELPVSYL 
EPKLQDTIKAKYGELEPYVYFNKGL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGS CRERELDI PGPMSGEQ 
PPRLEAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANRE PVAERS E PALS GLP PATMGSGDLLLSGES QVEKTKLSS SE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAE Y WAC VLPDS LPPS PDRHS PLWNPNKE YEDLLDYT 
YP LRPG PQL PKHLDS R VPADP VLQDSGVDLDSFS VS PASTLKS P 
TNVS PNCP PAEATAL P FSGPRE PS LKQ W PS R VPQKQ GGMGLAS W 
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVSSLVS YLGS IS TLVTLPTGDI KGQS PLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSIX3SSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESL VQC \ VKT FC\ CQL EEL I CWL YNV\ AD VTDHGTPAR 
SNLTS LK\ S S I^JL YRQ FKKD I D EHQSLTE S VLQKGE I LLQCLLE 
NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLIP 
DKKPMAAMEHPCEGV


5424 


3186 


905 


G VSMALGEEKAEAEAS EDTKAQS YGRGS CRERELD I PGPMSGEQ 
P PRLEAEGGL I S PVWGAEG I PAPTCW IGTDPGGPS RAHQ PQAS D 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LS CLSQWKS VLSPGSAAQPSSCS ISASS TGSS LQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGnASGL 
GRRRLS FQAE YWACVL P DSLP PS PDRHS PLWNPNKE YE DLLDYT 
YPLR PGPQLP KHLDS RVPADP VLQDS G VDLDS FSVS PASTLKS P 
TNVS PNC P PAEATALP FSGPRE P SLKQW P S RVPQKQGGMGLAS W 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SQ LASTPRAPG S RDARWERR E PALRGAKDRLT IGKHLDMG S PQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
IaALPARLTQVSSIiVSY LGS I STLVTLPTGDI KGQS PLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGS SQALGVSSGLLKTRPS LPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
SNLTSLK\ S S LQLYRQFKKD IDEHQSLTES VLQKGE I LLQCLLE 
NTPVLEDVLGRIAKQSGELESHADRLYDSIIaASLDMIAGCTLIP 
DKKPMAAMEHPCEGV 


5425 


1086 


115 


GFCPSPSLGHQPPRVLHPTMSMAVETFGFFMATVGLLMLGVTLP 
NS YWRVSTVHGNVITTNTI FENLWFSCATDSLGVYNCWEFPSML 
ALSGYIQACRALMITAILLGFLGLLLGIAGLRCTNIGGLELSRK 
AKLAATAGAPH\ ILPGICGMVAI \SWYAFNITR\DFSDPLYPGT 
KYELGPALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRP 
YQAP VS VM PVATS DQEGD S S FGKYGRNALR VAAL CRGPRCL PTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWE VAYLPSEAGSLI F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTS FGRRLLVL I PVYLA 
GAVGLSVGFVLFGLALYLGWRRVRDEKERSIjRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 
LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR 
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 
TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRV1DEELN 
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWFPLQGGQGQVHLRLEWLSLLSDAEKLEQVLQWNWG 
VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMRILYIiDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNPRWNEVFEVIVTSVPGQELBVEVF 
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL
ERLTP R P TAAE LE E VLQVNS L I QTQ KS AELAAALLS I YMERAE D 
LPLRKGTKHLS P YATLTVGDSS HKTKT I SQTSAPVWDESASFLI 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSPGRRLLVLIPVYLA 
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL 
TAKTL YMSHRELPAWVS F PD VE KAE WLNKI VAQVW PFLGQ YMEK 
LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR 
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 
TMIMDS I AAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDBELN 
PQWGETYE VMVHEVPGQE I EVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWFPL^GQGQVHLRLEWLSLLSDAEKLEQVLQWN^JG 
VSSRPDP P SAAI LWYLDRAQDLPMVTS EL YPPQLKKGNKE PNP 
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLG ALT L PLARLLTAP E L I LDQ W FQLS S S G PNSRL YM 
KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VKLKLAG RS FRSHWREDLNPRWNE VFE VI VTSVPGQELE VE V F 
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLS I YMERAED 
L PLR KGTKHL S P YATLTVGDSSHKTKT I S QTS AP VWDES AS F Ii I 
RKPHTESLELQ VRGEGTG VLG S LS LPLS E LLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GG P PHI TS S APEV \ RQRLTHVDS PLEAPAG PLGQVKLTLW YYS B 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 


3 


1839 


S S RS ERLS ACA I AP P WL VSSR PAR PAQLQRPG JCMVEDGAEELED™ 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
LAAS I PYFHAMFTNDMMECKQDE I VMQGMDP S ALE AL I N FAYNG 
NLAIDQQNVQSLLMGAS FLQLQS I KDACCTFLRERLHPKNCLGV 
RQFAETMMCAVIiYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL 
VSRDELNVKSEEQVFEAALAWVRYDREQRGTFL\RNLQSNIRLL 
FCRPQFLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHLP 
AFRTRPRCCTSIAGLIYAVGGLNSAGDSLNWEVFDP1ANCWER 
CRPMTTARSRVGVAWNGLL YAI GGYDGQLRLS TVQAYNTETDT 
WTRVGSMNS KRS AMGT WLDGQ I YVCGG YDGNS SLS SVET YS PE 
TDKWTWTS MS S NRS AA \G VT VFEGR I Y VS GGHDG LQ I FS S VEH 
YKHH TATWHPAAGMLNKRCRHGAAS LG S KM F VCGG YDGS G F LS I 
AEM YSS V\ADQWCLIVPM\HTRR \ SRVSLGGPAVGRLYAVWGVT 
TGQSNL\SS VGDVLTPETDCWTFM \ APMACHEGGVGVG C I PLLT 
I 


5429 


825


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF
AQRERFHRFQPTYPYLQHEIDLPPTISLSDGEEPPPYQGPCTLQ
LRDPEQQLELNRESVRAPPNRTIFDSDLMDSARLGGPCPPSSNS
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL


5430 


441 


1507 


QKRRKRRRKKIMKTIQPKMHNS ISWAI FTGLAALCLFQGVPVRS 
GDAT F P KAMDNVTVRQGES ATLRCT I DNRVTRVAW LNRS T I L YA 
GNDKWCLDPRWLLSNTQTQYSIEIQNVDVYDEGPYTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRHISPKAVGFVSEDEYLEIQGITREQSGDYECSASNDV\A 
AP V\ VRR VKVT VN YP P Y I S EAKGTG VPVGQKGTLQ CEAS AVPS A 
EFQWYKDDKRL I / EGKKGVKVENRPFLS KLIFFNVSEHDYGNYT 
CVASNKLGHTNAS I ML FG PGAVS EVSNGTS RRAG CVWLLPLLVL 
HLLLKF 


5431 


n 
A 


1312 


AAAAPGS RRRR PLPDRPHMAHGYEAPPPPAPRSPAWRARSKP V\ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERISBLGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY 
SDGEISI CMEHMDGGS LDQVLKEAKRI PEE ILGKVS I AVLRGLA 
YLREKHQIMHRDVKPSNILVNSRGEIKLCDFG7SGQLIDSMANS 
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP 
DAXELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLKMLTNHTFIKRSEVEEVDFAGWLCKTZjRLNQPGTPTRTAV 


5432 


•5 


1312 


AAAAPGS RRRRP L PDRPHMAHG YEAPP P PAPRS PAWRARS K P V \ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 

KKRLEAFXTQKAKVGELKDDDFERISELGAGHGGVVTKVQHRPS

GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY
SDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 

YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP
AMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV


5433 


360 


1885 


SVQEDKVGFEDPI^CSWRARACPCTWPHC/CTGLIiECLGFAGV 
LFGWPSLVFVFKNEDYFKDLOGPDAGPIGNATGQADCKAQDERF 

slifti^sfmiwfotfptgyifdrfkttvarliaiffyttatli 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








I AFTSAGS AVLLFLAM PMLT I GG I LFL I TNLQ I GNL FGQHRST I 
ITLYNGAFDSSSAVFLIIICLLYEKGISLR/VLLHIjHLCLQYLAC 
S TH FPPDAPGAHP I P TAPQLQLW P VPWEWHHKGREG /QQLS MKT 
GS YSQRSS FQRRKR PQGQGRS RNS APSGATL / CS RRFAWHLVWL 
SVIQLWHYLFIGTLNSLLTNMAGGDMARVSTYTNAFAFTQFGVL 
CAPWNGLLMDRLKQKYQKEARKTGSSTLAVALCSTVPSLALTSL 
LCLGFALCAS VPILPLQYLTF I LQVI SRS FL YGSNAAFLTLAFP 
SEHFGKLFGLVMALSAWSLLQFPIFTLIKGSLQNDPFYVNVMF 
MLA I LLTFFHP FLV YRE CRTWKE S PS AI A 


5434 


66 


652 


R YAAL 1 1 S L I QHKLLWRNQHCS RC VIMSPAQSAGLNWLF / GSGK 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECEHTBLIDXPDBTAITCPOCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGBCPECHYPLIiIEKKT 
AQG VKHFCAS KQCGKP VS AE 


5435 


4704 


1597 


PGDSSQRLAEMSNAKERKHAKKMRNQPTNVTLSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQE I PKY I TASTFAQARAAE IS AMLKAV 
TQKSSNSLVFQTLPRHMRRRAMSHNVKRLPRRLQEIAQKEAEKA 
VHQKKEHS KN KCHKARRCHMNRTLE FNRRQKKN I WLETH I WHAK 
R FHMVKKWG YCLGER P T VKS HRAC YRAMTNRCLLQDLS YYCCL E 
LKGKEEE1LKALSGMCNIDTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPREMLGPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
I KAACQCVEPI KSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGENAKPIKKI IGDGTRDPCLP YS W I S PTTG III SDLTMEMN 
RFRliIGPLSHS ILTEAI KAASVHTVGEDTEETPHRWWIETCKKP 
DS VS LHCRQEAI FELLGG I TSPAE I PAGTILGLTVGDPRINLPQ 
KKSKALPNPEKCQDNEKVRQLLLEGVPVECTHS FIWNQDI CKSV 
TENKISDQDLNRMRSELLVPGSQLILGPHESKIPILLIQQPGKV 
TGEDRLGWG S G WDVLLP KGWGMAFW I P FI YRG VR VGGLKES A VH 
S Q YKRS PNVPGDFPD CP AGM LFAE EQ AKNLLE KYKRRP P AKRPN 
YVKLGTLAPFC CPWEQLTQDWESR VQAYE EPS VAS S PNGKE SDL 
RRS EVP CAPM P KKTHQP S DEVGTS I EHPREAEE VMDAGCQES AG 
PERITDQEASENHVAATGSHLCVLRSRKL.LKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLS ILGHFPRALVWVSLSLLSKGS PE 
PHTM I CVPAKEDFLQLHEDWH YCG PQES KHSDP FRS KI LKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAGQEALTLGLWSGPLPRVTL 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTGLLDMLSSQPAAQ 
RGLVLLRPPASLQYRFARIAIEV 


5436 


1781 


635 


ASDS I PWSEARTTRKLAQRGCQWSLPERMPLWFCGLP YSGKSR 
RAEELR VALAAEGRAVYWDDAAVLGAED PAVYGDSAREKALRG 
ALRASVERRLSRHDWILDSLNYIKGFRYELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSS VLRELHTADS WNGSAQADVPKELEREESGAAES PALVTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQLDQVTS 
QVLAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAEIiSRLRR 
QFISYTKMHPNNENIfPQLANMFLQYLSQSLH 


5437 


739 


1672 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
P S KE E PQVEQLGS KRMDSLKWDQ P ISS TQE S GRLE AGGAS PKLR 
WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGWIiR 
GEPGAP SR YLGG P EECLQI S TNLTLHLLELLASALLALCSRPLR 
AALDTLGLRGPIiGLWLHGLLS FLAALHGLHAVLSLLTAHPLHFA 
CLFGLLQALVLAVSLREPNGDEAATDWESEGLEREGEEQRGDPG 
KGL 


5438 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKWI^FPQLAIiRRRLGQLSC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLIiS RAWGRLNQVE LPHWLRRP VYS L Y I WTFGVNMKE AAVE 
DLHHYKNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT 
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SEQ 
ID 
NO: 


Predicted 

beginning 

nucleotide 

location 

corresponding

to first 

amino acid 

residue of 

amino acid 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








REGNELYHCV I YLAPGDYHCFHS PTDWTVSHRRHPPGSLMS VNP 
GMARWIKELFCHNERWLTGDWKHGFFS LTA VGAT \ NWG S IR I Y 
FDRDLHTNS PRHS KGS YNDFS FVTHTNREG VPMALRGEHLG / Q S 
FNLGSTTVLIFEAPKDFNFQLKTGQKIRFGEALGSL 


5439


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
IAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGR I/NQVELPH WLRRP VYS L Y I WTFG VNMKBAAVE 
DLHHYRNL S E FFRRKLKPQARP VCGLH S VI S P S DGR I LNFGQ VK 
NCEVEQVKGVTYSLESFU3PRMCTEDLPFPPAASCDSFKNQLVT 
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
GMARW I KELFCHNE RWLTGDWKHGFFS LTAVGAT \NWGS I R I Y 
FDRDLHTNS PRHS KGS YNDFS FVTHTNREGVPMALRGEHLG /QS 
FNLGS T I VL I FEAP KDFNFQLKTGQKIRFGEALGSL 


5440 


693 


253 


EPIPVTPDHRLVTMTHIV\QTFSPVNS\GQPPNYEMLKEEQEVA 
MLGAPHNPAPPMSTVIHIRSETSVPDHVVWSLFNTIiFMNTCCLG 
F IAFAYS VKSRDRKMVGD VTGAQ AYAS T AKCLN I WAL I LG I FMT 
IliLIIIPVLWQAQR 


5441 


2 


2054 


CRDGGKNGFMVS PMKPLE I KTQCSGPRMDPKI CPADPAFFS FIN 
NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSP 
ALE ERKT DS YR YP RTGSKNP K I ALKLAE FQTDSQGK I VSTQEKE 
L VQPF S S L F P KVE Y I ARAG WTRDGKYAWAM FLDR PQQWLQLVLL 
PPALFIPSTENEEQ\RLASARAVPRNVQPYWYEEVTNVWINVH 
DIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSQGYDW 
SEPFSPGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHLYWS 
YEAAGEIVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPTVLFVYGGPQVQLVNNSFKGIKYXiRLNTLASLGY 
A WVIDGRGS CQRGLRFEGALKNQMGQ VE I EDQ VEGLQ FVAEKY 
GFIDLSRVAIHGWSYGQFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTG YTERYMD VPENNQHG YE AG S VALHVE KL PNE PNRLLI LH 
GFLDENVHFFHTNFLVSQLIRAGKP YQLQVALPPVS PQI YPNER 
HS IRCPESGEHYEVTLLHFLQEYL 


5442 


1 


3474


CGQRSRRRS PDMPEAKPAAKKAPKGKDAP KGAP KEAPPKEAPAE 
AP KEAP P ED QSPTAEE PTGVFLKKPDSVS VETO KDAWVAKVNG 
KE LPDKPT I KWFKGKWLELGS KSGARFS FKE SHNS ASNVYTVEL 
HIGKWLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 
ESFKRTSEKKSDTAGELDFSGLLKKREVVEEEKKKKKKDDDDLG 
I P PE I WE LLKGAKKS E YE KI AFQYG I TDLRGMLKRLKKAKVE VK 
KSAAFTKKLDPAYQVDRGNKIKLMVEISDPDLTLKWFKNGQEIK 
PSSKYVFENVGKKRILTINKCTLADDAAYEVAVKDEKCFTELFV 
KE P P VL I VT P LEDQQ VFVGDR VEMAVE VS E EGAQVMWMKDG VEL 
TREDSFKARYRFKKDGKRHILIFSDWQEDRGRYQVITNGGQCE 
AEL I VE E KQLEVLQD I ADLTVKAS EQAVFKCE VS DEKVTG KW YK 
NGVEVRPSKRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 
GSLSAKLNFLEIKVEYVPKQ\EPPKI PLGFASGGKTSENAD/ 1 V 
WAGNKLRLDV\SITGEAPSPFAT\WLKG\DEVFTTTEGRTRIE 
KRVDCSSFVIESAQREDEGRYTIKVTNPIGEDVASIFLQWDVP 
DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSQ 
RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 
TKP FMP IAPTS EPLHLI VED VTDTTTTLKWR P PNR I GAGG I DG Y 
LVEYCLEGSEEWVPANTEPVERCGFTVKNLPTGARILFRWGVN 
IAGRSEPATLAQPVTIREIAEPPKIRLPRHLRQTYIRKVGEQLN 
LWPFQGKPRPQWWTKGGAPLDTSRVHVRTSDFDTVFFVRQAA 
RSDSGEYBLSVQIENMKDTATIRIRVVEKAGPPINVMVKEVWGT 
NALVEWQAPKDDGNSEIMGYFVQKADKKTMEWFNVYERNRHTSC 
TVSDLIVGNEYYFRVYTENICGLSDSPGVSKNTARILKTGITFK 
PFEYKEHDFRMAPKFLTPLIDRVWAGYSAALNCAVRGHPKPKV 
VWWKNKMEIREDPKFLI TNYQG VLTLNI RRPS P FDAGTYTCRAV 
NELGEALAECKLEVRVPQ 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 


5443 


66 


1003 


S RGQIiDAGQS S EQHGGNRQPEQS RS RSS S S S S S PRRSRS AAE PA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKGWFS VTTVDLKRKPADLQNLAPGTHP PFITFNS EVKTDV 
NKI EEFLE EVLCPPKYLKLS PKHPESNTAGMDI FAKFSAYI KNS 
RPEANEALERGLLKTLQKLDEYLNSPLPDEIDENSMEDIKFSTR 
KFLDGNEMTIADCNIiIjPKLHIVKVVAKKYRNFDIPKEMTGIWRY 
LTNAYSRDEFTNTCPSDKEVEl\AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGP IG VTGAQMAKWLRDYLS FGGRRPPPQP PTPDYTESD ILRAY 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
I KVEAADMARAKALLGGPGEE LE ADTEYLD P FDAQ PHP AP PDDG 
YMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QSRMPQEDERPADE YDQPWEWKKDHI SRAFAVQFDS PEWERTPG 
SAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHS GPFPS VPEL VLHYS 5R PL P VQGAEHLALL YP WTQTP * Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLP 
RAE KPGLRG P LLGLRE P LGAG PRG P WGLQE PRRCQ VWF S QAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVLLLTWARRRWRETRSRREPT 
TLRAQSVC PWW I * ETRMNRS I PVEVDESEP YPS QLLKP I PEYS P 
EEESEPPAPNI RNMAP NS LS APT MLHNS S GD FS Q AHSTLKLANH 
QRP VS RQVTCLRTQVLE DS E DS F CRRHPGLGKAFPSGC S AVSE P 
AS E S WGALPAEHQFS FME KRNQ WL VSQLS AAS PDTGHDS DKS D 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDIiPTIDTGYDSQPQ 
DVLGIRQLERPLPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLPPtfLSPHAPWNYHYHCPGSPDHQVPYGHDYPRAAYQQVIQP 
ALPGQPLPGASVRGLHPVQKVILNYPSPWDQEERPAQRDCSFPG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NG FQTAI D I FEDR IRG I D 1 1 KWMER YLRD KTVM 1 1 VAI S P K YKQ 
DVEGAESQLDEDEHGLHTKYIHRMMQIEFIKQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTHVYSWPKNKKNILLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


S S WS WCTGRMRKTRLWGLL WMLF VS ELRAATKLTEEK YEL KEGQ 
TLD VKCD YTLE KFAS S QKAWQI I RDGEMPKTLACTERP S KNSH P 
VQVGRIILEDYHDHGIiLRVRMVNLQVEDSGLYQCVIYQPPKEPH 
ML FDR I RLWTKG FSGTPGSNENSTQNVYKI P PTTTKALC PLYT 
TPRTVTQAPPKS TADVSTPDS EINLTNVTDI IRVPVFNI VILLA 
GGFLS KS LVFS VL FAVTLRS FVP * AHE PTRMS SDFQPHP SGS CA 
KGGGRR 


5447 


207 


617 


MTARTLS LMASLVAYDDS DS EAETEHAGSFNATGQQKDTSGVAR 
PPG QDFASGTLD VPKAGAQPTKHGS CE DPGG YRLPLAQLGRS DR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCWPY 
TPRRLRQRQALS TETX3KGKD VEPQGP PAGRAPAPL YVG PGVSEF 
IQPYLNSHYKETTVPRKVLFHLRGHRGPVNTIQWCPVLSKSHML 
LSTSMDKTF KVWNAVDS GHCLQTYS LHTEAVRAARWAP CGRRI L 
SGGFD FALHLTDLETGTQLFSGRSD FR I TTLKFHP KDHN I FLCG 
GFSS EMKAWD IRTGKVMRS YKAT IQQTLDI LFL REGS EFLS S TD 
ASTRDSADRTI I AWDFRTSAKISNQI FHERFTCPSLALHPREPV 
r LJ\\2 1 JMWN x iiALi c is L VWP YRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTG S ADGRVLM YS FRTAS RACTLQGHTQAC VGTTYHP VLP 
S VLATCS WGGDMKI WH+AFHWLSLGEA I GDLAPARG YSG PGRS L 
KSPSPSKSLLVLLCGRAMFQPATCPWQLPALSK 


5448 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF
RWWLQVTSKVIFFWLLVLYLLQVAAIVLFCSTSSPHSIPLTEVI
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCET
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVIISFWRVSLVWI
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT
SETDVENHQINPCVKKEYRDDPFHQSHLPMLHSSHPGLEKISAI
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV
IISFWRVSLVWIFFFLLCVAERTYKQVGIM


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKLTA
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF
RWWLQVTSKVIFFWLLVLYLLQVAAIVLFCSTSSPHSIPLTEVI
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCET
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVIISFWRVSLVWI
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV
IISFWRVSLVWIFFFLLCVAERTYKQVGIM


5450 


3136 


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLLLAAG
PADHLLLQLYSGHLQVRLVLGQEELRLQTPAETLLSDSIPHTW
LTWEGWATLSVDGFLNASSAVPGAPLEVPYGLFVGGTGTLGLP
YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD
VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG
RRGDFIYVDIFEGHLRAWEKGQGTVLLHNSVPVADGQPHEVSV
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR
HLQEHRLGLTPEATNASLLGCMEDLSVNGQRRGLREALLTRNMA
AGCRLEEEEYEDDAYGHYEAFSTLAPEAWPAMELPEPCVPEPGL
PPVFANFTQLLTISPLWAEGGTAWLEWRHVQPTLDLMEAELRK
SQVLFSVTRGAHYGELELDILGAQARKMFTLLDWNRKARFIHD
GSEDTSDQLVLEVSVTARVPMPSCLRRGQTYLLPIQVNPVNDPP
HIIFPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR
VSDGLQASPPATLKWAIRPAIQIHRSTGLRLAQGSAMPILPAN
LSVETNAVGQDVSVLFRVTGALQFGELQKHSTGGVEGAEWWATQ
AFHQRDVEQGRVRYLSTDPQHHAYDTVENLALEVQVGQEILSNL
SFPVTIQRATVWMLRLEPLHTQNTQQETLTTAHLEATLEEAGPS
PPTFHYEWQAPRKGNLQLQGTRLSDGQGFTQDDIQAGRVTYGA
TARASEAVEDTFRFRVTAPPYFSPLYTFPIHIGGDPDAPVLTNV
LLWPEGGEGVLSADHLFVKSLNSASYLYEVMERPRLGRLAWRG
TQDKTTMVTSFTNEDLLRGRLVYQHDDSETTEDDIPFVATRQGE
SSGDMAWEEVRGVFRVAIQPVNDHAPVQTISRIFHVARGGRRLL
TTDDVAFSDADSGFADAQLVLTRKDLLFGSIVAVDEPTRPIYRF
TQBDLRKRRVLFVHSGADRGWIQLQVSDGQHQATALLEVQASEP
YLRVANGSSLWPQGGQGTIDTAVLHLDTNLDIRSGDEVHYHVT
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLSPEDTMAF
SVEAGPVHTDATLQVTIALEGPLAPLKLVRHKfaYVFQGEAAEI
RRDQLEAAQEAVPPADIVFSVKSPPSAGYLVMVSRGALADEPPS
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE
GVLVELEVLPAAIPLEAQNFSVPEGGSLTLAPPLLRVSGPYFPT






WO 01/53312 



PCT/US00/34263 



LLGLSLQVLEPPQHGPLQKEDGPQARTLSAFSWRMVEEQLIRYV 
HD G S E TLTDS F VLMANAS EMDRQS HP VAFT VT VL P VNDQ P P I LT 
TNTGLQMWEGATAPIPAEALRSTDGDSGSEDIjVYTIEQPSNGRV 
VLRGAPGTEVRSPTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLLLYRVVRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GN 1 L Y EHEMP P E P FWEAHDT L ELQLS S P PARDVAATLAVAVS FE 
AACPQRPSHLWKNKGLWVPEGQRARITVAALDASNLLASVPSPQ 
RSEHDVLFQVTQFPSRGQLLVSEEPLHAGQPHFLQ3QLAAGQLV 
Y AHGGGGTQQDG FHFRAHLQGPAGAS VAG PQTS EAFAI T VRD VN 
ERPPQPQASVPLRLTRGSRAPISRAQLSVVDPDSAPGEIEYEVQ 
RAPHNGFLS LVGGGLG P VTRFTQAD VDS GRIiAFVANGS S VAG I F 
QLSMSDGASPPLPMSLAVDIIiPSAIEVQIiRAPLEVPQALGRSSL 
SQQQLRWSDREEPEAAYRLIQGPQYGHLLVGGRPTSAFSQFQI 
DQGEWFAFTNFSSSHDHFRVLALARGVNASAVVNVTVRALLHV 
WAGG P WPQGATLRLDPTVLDAGELANRTGS VPR FRLLEGPRHGR 
VVRVPRARTEPGGSQLVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDSI/TIjELWAQGVPPAVAS LDFATE PYNAARP YSVALLS VP EA 
ARTEAGKPESSTPTGEPGPMASSPEPAVAKGGFIiSFLEANMFSV 
1 1 PMCLVLLLLAL I LPLLFYLRKRNKTGKHDVQVLTAKPRNGLA 
GDTETFRKVEPGQAIPLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451 


1 


2274 


RDS S EQGRTGDTLGRPS ACMD ALKP PCLWRNHE RG KKDRDS CGR 
KNSEPGSPHS LEALRDAAPS QGLN FLLL FTKML F I FNFL FS P LP 
TPAL I CI LTFGAAI FLWLI TRPQPVLPIjIjDLNNQSVGIEGGARK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNQ 
PYRWLSYKQVSDRAEYLGSCLLHKGYKSSPDQFVGIFAQNRPEW 
IISELACYTYSMVAVPLYDTLGPEAIVHIVNKADIAMVICDTPQ 
KALVLIGNVEKGFTPSLKVIILMDPFDDDLKQRGEKSGIEILSL 
YDAENLGKEHFRKPVPPSPEDLSVICFTSGTTGDPKGAMITHQN 
IVSNAAAFLKCVEHAYEPTPDDVAISYLPLAHMFERIVQAWYS 
CGARVG F FQGD I RLLADDMKTLKPTLF P AVPRLLNR I YD KVQNE 
AKTPlfKKFLLKLAVSSKFKELQKGI IRHDSFWDKLI FAKIQDSL 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPLACNYVKIiEDVADMNYFTVNNEGEVCI KG 
TNVFKGYLKDPEKTQEALDSDGWLHTGDIGRWLPNGTLKIIDRK 
KNIFKLAQGEYIAPEKIENIYNRSQPVLQIFVHGESLRSSLVGV 
WPDTD VLPS FAAKLG VKGS FEELCQNQ WREA I L EDLQKIG KE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452 


1833 


1138 


S R VPSLCLS LSLSLS PSREP VAGAPGCX3TAGPPAMATLWGGLLR 
LGSLLSLSCLALSVLLLAQLSDAAKNFEDVRCKCICPPYKENSG 
HIYNKNISQKDCDCLHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTI II YLSIIX5LLLLYMVYLTLVEPILKRRI/FGHAQLI QS 
DDDI GDHQP FANAHDVLARSRS RANVLNKVE YAQQRWKLQVQ E Q 
RKSVFDRHWLS 


5453 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTQSVKIQD
LGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAEEDDGEKIAI
KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL*


5454 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGD
LGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAEEDDGEKIAI
KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG
YVCEGDKKTMAKAIKDRVSLIKRKREQRQL*


5455 


1359 


377


LTMVS PATRKSLPKVKAMDFITSTAI LPLLFGCLGVFGLFRIiLQ 
WVRGKAYLRNAVWI TGATS GLG KE CAKVF YAAGAKLVLCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
• QCFGYVDI LVNNAG I S YRGT I MDTTVDVD KRVME TNY FGP VALT 
KALLPSMIKRRQGHI VAISS IQGKMS I PFRSA YAA5KHATQAFF 
DCLRAEMEQYE I EVTVI S PG YIHTNLS VNAI TADGS R YG VMDTT 
TAQGRS PVEVAQDVLAAVGKKKKD VI LADLLPSLAVYLRTLAPG 
LFFSLMASRARKERKSKNS 


5456 


2 


2332 


CGAGLVAAGAVLVLYPASRAGERTRVPGS PAPS SLPLHS PGACG 
TEVDMDPQRSPLLEVKGNIELKRPLIKAPSQLPIiSGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSIjTTVPQTQGQTTA 
OKVS KKTGPRCS TAIATGLKNQK P VPAVP VQKS GTSGVP PMAGG 
KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQENQQLQDQLR 
DAQQQVKALGTERTTLEGHLAKVQAQAEQGQQELKNLRACVLEL 
EERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 
ALSSSQAEVASLRQETVAQAALLTEREERLHGLEMERRRLHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSALDG YP VCI FAYGQTGSGKTFTMEGGPGGDPQLEGL I PR 
ALRHLFS VAQELSGQGWTYS FVASYVE I YNETVRDLLATGTRKG 
QGGECEIRRAGPGSEELTVTNARYVPVSCEKEVDALLHLARQNR 
AVARTAQNERS SRSHS VFQLQISGEHS SRGLQCGAPLSLVDLAG 
SERLDPGLALGPGERERLRETQAINSSLSTLGLVIMALSNKESH 
VPYRNSKLTYLLQNSLGGSAKMLMFVNISPLEENVSESLNSLRF 
ASKVEPS^FGTAQSNRKWKTDPDLCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPLKGTPA 
LLRSAERLMRKVKKLRLDKENTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRSIIHGSRKYSGLIVNK 
APHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLLYSEIPKKV 
RKEALLLL SWKQMLDHFQATPHHG VYSREEELLRERKRLG VFG I 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVLDDPKSAGVATFVI QE EFDRFTGYVJWCPTASW 
EGSEGLKTLRILYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 
GSKNPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALF I PSTENEEQA 
ASLCQS CPQECPAVCGVRGGHQRLDQCS 


5458


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVMEAQPEWLRAEV 
KRLSHELAETTREKIQAAEYGLAVLEEKHQLKLQFEELEVDYEA 
I RSEMEQLKEAFGQAHTNHKKVAADG ES REES L I QES AS KEQ Y Y 
VRKVLELQTELKQLRNVLTNTQS ENERLAS VAQELKE I NQNVE I 
QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 
VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKEISERQLEEALET 
LKTERE QKN5 LRKELS H YMS INDS FYTSHLHVSLDGLKFSDDAA 
E PNNDAEALVNG FEHGGLAKLP LDNKTS TP KKEGLAP PSPSLVS 
DLLSELNISEIQKLKQQI^QMEREKAGLLATLQDTQKQLEHTRG 
SLSEQQEKVTRLTENLSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVD INGPEI LACK YHVAVAE AGELREQ LKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEICVSLLEKASRQDRELLARLBKELKKVS 
DVAGE TQGS LS VAQDELVTF S EE LANLYHHVCMCNNETPNRVML 
D Y YREGQGGAGRTS PGGRTS PEARGRRS P ILLPKGLLAPEAGRA 
DGGTGDS S PS PGS SLPSPLSD PRREPMNI YNL I A 1 1 RDQI KHLQ 
AAVDRTTELSRQRIASQELGPAVDKDKEALMEEILKLKSLLSTK 
REQ I TTLRTVLKAN KQT AEVALANLKS KYENE KAMVTE TMMKLR 
NELKALKEDAATFSSLRAMFATRCDEYITQLDEMQRQLAAAEDE 
KKTLNSLLRMAI QQKLALTQRLE LLELDHE QTRRGRAKAAPKTK 
PATPSVSKTCACASDRAEGTGIANQVFCSEKHSIYCD 


5459 


316 


1262 


RGGHRLSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS 
KGPKRLEKFSDERAAYFRCTHKVTELNWKNVARLPKSTKKHAI 
G I YFNDDTS KT FACE S DLE AD E WCKVLQME C VGTRIND I S LGE P 
DLLATGVEREQSERFNVYLMPSPNIX3CYMGECALQITYEYICLW 
DVQNPRVKLISWPLSALRRYGRDTTWFTFEAGRMCETGEGLFIF 
QTRDGEAIYQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 
S LS TM VPL PRS A YWQH I TRQHS TGQL YRLQDVS S PLKLHRTETF 
PAYRSEH 


5460 


45 


2097 


RPGCRAGELS TGSRARERVRNRVS APCGQDSRRCDPEVLRGRS P 
GLGLAEMPS CGACTCGAAAVRLI TS SLASAQRGI SGGRIHMSVL 
GRLGTFETQI LQRAPLRS FTETPAY FAS KDGISKDGSGDGNKKS 
ASEGSSKKSGSGNfSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CE KCHH FFWLS EADS KKS 1 1 KE P E S AAEAVKIiAFQQ KP P P PP K 
K I YNYLDKYWGQSFAKKVLS VAVYNHYKR I YNNIPANLRQQAE 
VE KQT S LT PRE LE I RRREDEYRFT KLLQ I AG I S PHGNALGASMQ 
QQVNQQIFQEKRGGEVLDSSHDDIKLEKSNILLLGPTGSGKTLL 
AQTLAKCLDVP FAI CDCTTLTQAG YVGED I ESV I AKLLQDANYN 
VE KAQQG I VFLDE VD K I G S VPG I HQ LRDVGGEG VQQGLLKLLEG 
TIVNVPEKNSRKLRGETVQVDTTNItFVASGAFNGLDRIISRRK 
NEKYLGFGTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRL 
LRHVEARDL I E FGMI PE FVGRLP WVPLHS LDE KTLVQ I LTEPR 
NAVI PQYQALFSMDKCELNVTEDALKAIARLALERKTGARGLRS 
IMEKLLLEPMFEVPNSDIVCVEVDKEVVEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 




5461


1481


160 


rNPPPPPKSPCGRARKWRRRRRPGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLGIHVFLVSCALPD 
S VLRRF WRTMCAVLGLVARQEDSGLRDHS VRVL I SNHVTP FDH 
NI VNLLTTC S T PLLNS P P S F VCWS RG FMEMNGRGELVE S LKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
QVQRPLVSVTVSDASWVSELLWSLFVPFTVYQVRWLRPVHRQLG 
EANEE FAIjRVQQLVAKELGQTGTRLTPADKAEHMKRQRH PRLRP 
QSAQS SFPPSPQPS PDVQLATLAQR VKE VL PHVPLG VIQRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASiOFPSSGPV 
TPQPTALTFAKS S WARQES LQER KQAL YEYARRRFTERRAQ EAD 


5462 


663 


3353 


KIKERQMSANNSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA 
RLSNGS FSAPS IiTNSRGS VHTVS FLLQI GLTRES VTIEAQE LS L 
S AVKDLVCS I VYQKFPECGFFGM YDKI LLFRHDMNSEN 1 LQL I T 
SADEIHEGDLVEWLSALATVEDFQIRPHTLYVHSYKAPTFCDY 
CGEMLWGIiVRQGLKCEGOJIJmiKRCAFKIPWCSGVRKRRLSW 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKMVMCRVKVPHTFAVHS YTRPT I CQYCKRLLKGLFRQGMQC 
KDCKFNCHKRCASJCVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
S PSTSNNI PLMRWQS I KHTKRKSSTMVKEGWMVHYTSRDNLRK 
RHYWRLDS KCLTLFQNESGS KYYKE IPLSEILRISSP RDFTN I S 
QGSNPHCFE 1 1 TDTMVYFVGENPJGDS SHNP VLAATGVGLD VAQS 
WEKAIRQALMPVTPQASVCTSPGQGKDHKDLSTSISVSNCQIQE 

RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFWMEK 
LHGDMLEMILSSEKSRLPERITKFMVTQILVALRNLHFKNIVHC 
DLKPENVLIiASAEPFPQVKLCDFGFAR I IGEKSFRRS WGTPAY 
LAPE VLRS KG YNRS LDMWS VG VI 1 YVS LSGT FPFNEDED I NDQ I 
QNAAFM YPPNPWRE I SGEAIDLINNLLQVKMRKRYS VDKShSBP 
WLQD YQTWLDLR E FETRIGER Y I THES DDARWE 1HAYTHNLVY P 
KHFIMAPWPDDMEEDP 


5463 


237 


1012 


LLSVTMTTSRCSHLPEVLPDCTSSAAPWKTVEDCGSLVNGQPQ 
YVMQ VS AKDGQLLS TWRTLATQS PFNDRPMCR I CHE GS S QEDL 
LSPCECTGTLGTIHRSCLEHWLSSSNTSYCSLCHFRFAVERKPR 
PL VEWLRN PGPQHE KRTLFGDMVCFL F I TPLATI SG WL CLRGAV 
DHLHFSSRLEAVGL I ALTVALFTI YLFWTLVS FRYHCRLYNEWR 
RTNQRVILLI PKS VNVP SNQ P S LLGLHS VKRNS KET W 


5464 


195 


677 


SPSMNPRKKVDLKL 1 1 VGA IGVGKTSLLHQYVHKTFyEE YQTTL 
GAS ILSKI I ILGDTTLKLQI WDTGGQERVRSMVSTFYKGSDGCI 
LAFD VTDL E S FEALD I WRGDVLAKI VPMEQS Y PMVLLGNKI D LA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465 


5278 


3348 


KGD PREF I R VHREALE CD YVS AHLHEW I DL I FG YKQQG P AAVEA 
VNVFHHLFYEGQVD I YNINDPLKETATI GFINNFGQ I PKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
KELKEP VGQ I VCTDKG I LAVEQNKVL I PPTWNKTFAWG YADLS C 
RLGTYESDKAMTVYECLS EWGQI LCAI CPNPKLVITGGTSTWC 
VWEMGTS KE KAKTVTLKQAL LGHTDTVTCATAS LAYH 1 1 VS G SR 
DRT C 1 1 WDLNKLS FLTQLRGHRAP VS ALC INELTGD I VS CAGTY 
IHVWSINGNPIVSVNTFTGRSQQIICCCMSEMNEWDTQNVIVTG 
HSDGWRFWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 
DEDSSDSEADEQSISQDPKDTPSQPSSTSHRPRAASCRATAAWC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNPIEVRNYSRLKPGYRWERQLVFRSKLTMHTAFDRKDNAHPA 
E VTALG I S KDHS RI L VGDS RGRVFS WS VSDQPGRS AADHW VKDE 
GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFQSEIKRLKI 
SS PVRVCQNCYYNLQHERGSEDGPRNC 


5466 


3 


992 


HACAHASAHASGRLVRWWRKRRS VMGIQTS P VLLASLGVGLVTL 
LGLAVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGL 
LT YTG KGH FN I Q PNKKS P PEPR VAKKLGM I AGGTG I TPMLQL I R 
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW 
FTLDHPPKDWAYSKGFVTADMIREHLPAPGDDVLVLLCGPPPMV 
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPDPQARIFIQKKDLEEDESVTAAHLKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECLQPDQRTL 
YRDVMLENYSHL I SLAGSS I SKPDVI TLLEQBKEP WMWRKETS 
RR YPDLE LKYG PE KVS PENDTSEVNL P KQV I KQ I S TTLG I EAFY 
FRNDS E YRQ FEGLQGYQEGN I NQKM I S YEKL PTHT P HAS h I CNT 
HKP YE CKECGK YFS CGSNLI QHQS I HTGEKP YKCKECGKAFQLH 
IQLTRHQKFHTGEKTFECKECGKAFnLPTQLNRHKN I hT V KXLF 
ECKECGKSFNRSSNLTQHQSIHAGVKPYQCKECGKAFNRGSNLI 
QHQKIHSNEKPFVCKECGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHQKI HTGE KP FE CRECGKAFS LLNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQL I EHSRIHTG 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLS QHQ KTHTGEKPFE CKECGKFFRRGSNLNQHRS I HTGKKP F 
ECKE CGKAFRLHMHL IRHQKLHTGEKP FECKE CGKAFRLHMQL I 
RHQKLHTGEKP FEC KECGKVFS L PTQLNRHKN I HTGEKAS 


5468 


225 


2976 


S FLTDL FQSLAQLENLCKQLYETTDTTTRLQAEKALVEFTNSPD 
CLSKCQLLLERGSSSYSQLLAATCLTKLVSRTNNPLPLEQRIDI " 
RNYVLNYLATRPKLATFVTQALIQLYARITKLGWFDCQKDDYVF 
RNAITDVTRFLQDSVEYCI IGVTILSQLTNEINQVSATAFLIEA 

E S QHGLLMQLL KLTHNCLNFDF IGTS TDESSDDLCT VQ I PTS WR 
S AFLDS S TLQLS TI GRCE Y E KTCALLVQLFDQSAQS YQE LLQS A 
SASPMD I AVQEGRLTWLVY I IGAVIGGRVS FASTDEQDAMDGEL 
VCRVLQLMNLTDSRLAQAGNEKLELAMLSFFEQFRKIYIGDQVQ 
KSSKLYRRLS EVLGLNDETMVLS VFIGKI I TiJJLKY WGRCEP I TS 
KTLQLLNDLSIGYSSWKLVKLSAVQFMLNNHTSEHFSFLGINN 
QSNLTDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE 
AVAQMFSTNSFNEQEAKRTLVGLVRDLRGIAFAFNAKTSFMMLF 
EW I YPS YM P ILQRAI ELWYHDPACTTP VLKLMAELVHNRSQRLQ 
FDVSSPNGILLFRETSKMITMYGNRILTLGEVPKDQVYALKLKG
ISICFSMLKAALSGSYVNFGVFRLYGDDALDNALQTFIKLLLSI
PHSDLLDYPKLSQSYYSLLEVLTQDHMNFIASLEPHVIMYILSS
ISEGLTALDTMVCTGCCSCLDHIVTYLFKQLSRSTKKRTTPLNQ
ESDRFLHIMQQHPEMIQQMLSTVLNIIIFEDCRNQWSMSRPLLG
LILLNEKYFSDLRNSIVNSQPPEKQQAMHLCFENLMEGIERNLL
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS


5469 


134 


2653 


DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ
EPTCVSDYMSISTCEWKMNGPTNCSTELRLLYQLVFLLSEAHTC
VPENNGGAGCVCHLLMDDWSADNYTLDLWAGQQLLWKGSFKPS
EHVKPRAPGNLTVHTNVSDTLLLTWSNPYPPDNYLYNHLTYAVN
IWSENDPADFRIYNVTYLEPSLRIAASTLKSGISYRARVRAWAQ
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL
CYVSITKIKKEWWDQIPNPARSRLVAIIIQDAQGSQWEKRSRGQ
EPAKCPHWKNCLTKLLPCFLEHNMKRDEDPHKAAKEMPFQGSGK
SAWCPVEISKTVLWPESISWRCVELFEAPVECEEEEEVEEEKG
SFCASPESSRDDFQEGREGIVARLTESLFLDLLGEENGGFCQQD
MGESCLLPPSGSTSAHMPWDEFPSAGPKEAPPWGKEQPLHLEPS
PPASPTQSPDNLTCTETPLVIAGNPAYRSPSNSLSQSPCPRELG
PDPLLARHLEEVEPEMPCVPQLSEPTTVPQPEPETWEQILRRNV
LQHGAAAAPVSAPTSGYQEFVHAVEQGGTQASAWGLGPPGEAG
YKAFSSLLASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP
KPPLPQEQATPPLVDSLGSGIVYSALTCHLCGHLKQCHGQEDGG
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA
SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM
RVS


5470 


17 


1418 


TACR I RTSLNRG I AAVKEDAVEMLAS YGLA YS LMKFFTG PMS DF 
KNVGLVFVNS KRDRTKAVLCMWAGAI AAVFHTL I AYSDLGYYI 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
SFLVGCASISDVIAQWFVAILLHSHLECREPLLIPILSLYMGA 
LVRCTTLCLGYYKNIHDI I PDRSGPELGGDATIRKMLSFWWPLA 
LIIiATQRISRP I VNLFVSRDLGGSSAATEAVAI LTATYPVGHMP 
YGWLTE I RAVYP AFDKNNPS NKLVS TSNT VTAAH I KKFT F VCMA 
liSLTLCFVMFWTPNVSE KI L I DI I G VDFAFAELCWPLR I FS FF 
P VP VT VRAHLTG WLMTLKKT FVLAPS S VLRI I VL I AS L WL P YL 
GVHGATLGVGSLLAGFVGESTMDAI AACYVYRKQKKKMENE SAT 
EGEDSAMTDMPPTEEVTDIVEMREENE 


5471 


1868 


658 


RSSAP PGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV 
GPGVPGE VEMVKGQP FDVGPRYTQLQ Y I GEGAYGMVSS AYDHVR 
KTRVAIKKISPFEHQTYCQRTLREIQILLRFRHENVIGIRDILR 
ASTLBAMRDVYIVQDLMETDLYKLLKSQQLSNDHICYFLYQILR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLS 
NRPIFPGKHYLDQLNHIIjGILGSPSQEDLNCIINMKARNYLQSL 
P S KTKVAWAKLF P KSDS KALDLLDRMLT FNPNKR I T VEEALAH P 
YLEQ YYDPTDEP VAEE P FTFAMELDDL P KERLKE L I FQETARFQ 
PGVLEAP 


5472 


1469 


753 


LYVMARYLSDEEVAVS IDRLCKANGRSPS I P FGTVRI PGRARVR 
DPQALWIFGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSD KM PGR WTLLE DHEGCT WGVAYQVQG E Q VS KALKYLNVREA 

VUjuIJJI j\±»V Xr i lr\juJ\trU\J PIjKAIjAYVATPQNPGYLiGPAPEEA 

I ATQ I LACRGFS GHNLE YLLRVRD VMQL CG PQAQDEHIiAAI VDA 
VGTMLPCFCPTEQALALV 


5473


3 


2119 


FMNVKLLIQDLEDIEQRVPVMDAQYKIITKTAHLITKESPQEEG 
KEMFATMSKJjKEQLTKVKECYSPLLYESQQLLIPLEELEKQMTS 
FYDS LGK INE 1 1 TVLERE AQS S ALFKQKHQELLACQENC KKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRLQRQIADIHVAFQ3MVKK 
TGDWKKHVETWSRI^KKFEESRAELEKVLRIAQEGLEEKGDPEE 
LLRRHTEFFSQLDQRVLNAFLKACDELTDILPEQEQQGLQEAVR 
KLHKQWKDLQGEAPYHLLHLKIDVEKNRFLASAEECRTELDRET 
KLMPQEG S EK 1 1 KEHR VF FSDKG PHHLCE KRLQLI EELC VKLP V 
RDPVRDTPGTCHVTLKELRAAIDSTYRKLMEDPDKWKDYTSRFS 
E FSS W I STNETQLKG I KGEAI DTANHGEVKRAVE E IRNGVTKRG 
ETLS WL KS RLKVLTE VSS ENEAQ KQGDE LAKLS S S FKALVTLLS 
EVEKMLSNFGDCVQYKEIVKNSLEELISGSKEVQEQAEKILDTE 
NLFEAQQLLLHHQQKTKRISAKKRDVQQQIAQAQQGEGGLPDRG 
HE ELR KLESTLDGLERSRERQERR I Q VTLRKWERFETNKETWR 
YLFQTGSSHERFLSFSSLESLSSELEQTKEFSKRTESIAVQAEN 
LVKEASEIPLGPQNKQLLQQQAKSIKEQVKKLEDTLEEEYVIDK 
S 




5474


2


780 


TP DVRQLQASRRGIAVAS WCSPRW FAGEEMAFVKSGWLLRQST I 
LKRWKKNWFDLWSDGHL I YYDDQTRQNI EDKVHMPMDC INI RTG 
QECRDTQPPDGKSKDCMLQIVCRDGKTISLCAESTDDCIiAWKFT 
LQ D S RTNTAY VG S AVMTD E TS WS S P P PYTAYAAPAPEVGRTLS 
LQ Q AYG YGP YGGAYPPGTQ W YAANGQAYAVP YQ YP YAGLYGQQ 
PANQVI I RER YRDNDS DLALGMLAGAATGMALGS LF WVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSSPCLLHSPETFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNISIAVRKIALLLKPDKEIEHQGNHMTVRTL 
S T FRNYT VQ F D VGVEF E EDLRS VDGRKCQT IVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRKVR 


5476 


192 


1457 


S DS MS L LDCF CTSRTQVE S LRPEKQSETSI HQ YL VDE PTLS WS R 
PSTRAS E VLCSTNVSHYELQVE IGRGFDNLTSVHLARHTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYWTVFTVG 
S WLWVI S PFMAYGSASQLLRT YFPEGMS ETL IRNI LFGAVRGLN 
YLHQNGC IHRS I KASHILISGDGLVTLS GLSHLHS LVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSG IGESVLVSSGTHTVNSDRLHTPSSKTFS PAFFSLVQLC 
LQQDPEKRPSASSLLSHVFFKQMKEESQDSILSLLPPAYNKPSI 
SLPPVLPWTEPECDFPDEKDSYWEF 


5477 


3 


1044 


RGNSRLRYSHEDELQLPRLPELFETGRQLLDEVEVATEPAGSRI 
VQEKVFKGLDLLEKAAEMLSQLDLFSRNEDLEEIASTDLKYLLV 
PAFQGALTMKQ VN PS KRLDHLQRAREHF INYLTQCHC YHVAE FE 
LPKTMNNSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 
HRLSAMKSAVESGQADDERVRE YYLLHLQRWIDI SLEE IES I DQ 
E I KILRERDSSREAS TSNSSRQERPPVKPF ILTRNMAQAKVFGA 
G YPSLPTMTVSDW YEQHRKYGALPDQG I AKAAPEE FRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5479 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 

LENQQLIMQRATP 


5480 


444 


1952 


LSLTSRMEEAELVKGRLQAITDKRKIQEEISQKRLKIEEDKLKH 
QHLKKKALREKWLLDG I S SGKEQEEMKKQNQQDQHQIQVLEQS I 
LRLEKEIQDLEKAE LQISTKEEAILKKL KS IERTTEDI IRSVKV 
EREERAEESIEDIYANIPDLPKSYIPSRLRKEINEEKEDDEQNR 
KALYAMEIKVEKDLKTGESTVLSSIPLPSDDFKGTGIKVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAPVEVEELLRQASERNSKS PTEYH 
EPVYANPFYRPTTPQRETVTPGPNFQERIKI KTNGLGI GVNES I 
HNMGNGLSEERGNNFNHISPIPPVPHPRSVIQQAEEKLHTPQKR 
LMTPWEESNVMQDKDAPS PKPRLSPRETI FGKSEHQNSSPTCQE 
DEEDVRYNIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
G YDG 1 1 HAE LW IDDE EEEDEGE AEKPS YHP IAPHS Q VYQPAK P 
TPLPRKRSEASPHEKHKS 


5481 


3 


1422 


NSPGSVCLCQCVCPSLLHCLPPLLLLLLLPLLliHESPQPPALRV 
VATSSDRNFMNKHQKPVLTGQRFKTRKRDEKEKFEPTVFRDTLV 
QGLNEAGDDL EAVAKFLDSTGSRLDYRRYADTLFD I LVAGSMLA 
PGGTR I DDGDKTKMTNHCVFSANEDHETIRN YAQVFNKL IRR YK 
YLEKAFEDEMKKLLLFLKAFSETEQTKLAMLSGILLGNGTLPAT 
I LTS LFTDS LVKEG IAAS FAVKLFKAWMAE KDANSVTS S LRKAN 
LDKRLLELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 
KE liQ KE LQ ERLSQB CP I KEVVL YVKEEMKRNDLPE TAVI GLLWT 
CIMNAVEWNKKEELVAEQALKHLKQYAPLLAVFSSQGQSELILL 
QKVQEYC YDNI HFMKAFQFQ WLFYKADVLS EEAI LKWYKEAH V 
AKGKSVFLDQMKKFVEWIiQNAEEESESEGEEN 


5482 


1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPSRNLSLRL 
EGLQEKDSGPYSCSVWQDKQGKSRGHSIKTLELNVIiVPPAPPS 
CRLQGVPHVGANVTLSCQSPRSKPAVQYQWDRQLPSFQTFFAPA 
LDVIRGSLSh TNhS S S MAG V YVCKAHNEVGTAQCNVTLE VSTG P 
GAAWAGAWGTLVGLGLLAGLVIiLYHRRGKALEEPAND I KEDA 
IAPRTLPWPKSSDTISKNGTLSSVTSARALRPPHGPPRPGALTP 
TPSIiSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483 


1 


788 


FFFFKGCRAGRGNESDYRKIiEEMHQRFLVSERSKDDLQLRLTRA 
ENR I KQLE TDS SEE I SR YQEMI QKLQNVLES ERENCGLVS EQRL 
KLQQENKQLRKETESLRKIALEAQKKAKVKISTMEHEFS I KERG 
FBVQLREMEDSNRNSIVELRHLLATQQKAANRWKEETKKLTESA 
EIRINNLKSELSRQKLHTQELLSQIiEMANEKVAENEKLILEHQE 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 
ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 
LQNS DDDE KMQNTDDEERPQ LSDDE RQQ LS EE E KANSDDERP VA 
S DNDDE KQNS DDE EQPQLS DE EKMQNS DDERPQAS DEEHRHS DD 
EEEQDHKSESARGSDSEDEVLRMiCRKNAIASDSEADSDTEVPKD 
NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
PIPETRIEVEI P KVNTDLGNDL YF VKLPNFLS VE PRPFD P Q Y YE 
DEFBDEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEIKESNAR 
IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 
VFKTKLTFR PHS TDS ATHRKMTLS LADRCS KTQKIR IL PMAGRD 
PECQRTEMIKKEEERLRASIRRESQQRRMREKQHQRGLSASYLE 
PDRYDEEEEGEESISLAAIKNRYKGGIREERARIYSSDSDEGSE 
EDKAQRLLKAKKLTSDEVRPNLFNSRGLSCTQEPTALNEELTDQ 
AGTN 


5485 


161 


1074 


KRKILSSMMDSEAHEKRPPIliTSSKQDISPHITNVGEMKHYLCG 
CCAAFNNVAITFPIQKVLFRQQLYGIKTRDAILQLRRDGFRWLY 
RGILP PLMQKTTTLALMFGLYEDLS CLLHKHVSAPE FATSGVAA 
VLAGTTEAI FTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI 
GEYYRGLVP I LFRNGLSNVLFFGLRGP I KEHLFTATTHSAHLVN 

TIDTPTPT T PAMT y" 1 T7T DDBTWtnTVrnoTAOAT/VDDAODntrtjnAWT 

Ufc iC^tjljLAaAMliGrJUrrPINVVKTOIQS 

WLERDRKL INLFRGAHLN YHRS Ij I SWG 1 1 NAT YEFLLKV I 


5486 


1404 


142 


IPGSTISWSPAAARGLSVCRCCRLHPASAMDLFGDLPEPERSPR 
P AAG KEAQ KGP LL FDDLP PAS S TDSGSGGP LLFDDL P PAS SGDS 
GS LATS I SQMVKTEGKGAKRKTS EEEKNG S E ELVEKKVCKAS S V 
I FGLKGYVAERKGERE EMQDAHV I LND I TEECRP PS SLITRVS Y 
FAVFDGHGGIRAS KFAAQNLHQNLIRKFP KGDVI SVEKTVKRCL 
LDTFKHTDEEFLKQAS S QKPAWKDGSTATCVLAVDNI L YIANLG 
DSRAI LCRYNEESQKHAALS LS KEHNPTQYEERMRI QKAGGNVR 
DGRVLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQLTPNDRFILIi 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








ACDGLFKVFTPEEAVWFILSCLEDBKIQTREGKSAADARYEAAC 
NRLANKAVQRG S ADNVT VMWR I GH 


5487 


535 


182 


«.v ouoyiwijyi trnr V rijrl^b'^h'&Si^Mrlniiv 1 LiAJjLiijJjAQljTA 

LEANDPFANKDDPF Y YDWKNLQLS GL I CGGL LAI AG I AAVLSG K 
CKCKSSQKQHSPVPEKAIPLITPGSATTC 


5488 


1072 


259 


AMAASGEP QRQWQEE VAAVWVGS CMTDL VSLTS RL P KTGE TI H 
GHKF FIG FGG KGANQ C VQAARLGAMTSMVCKVGKDS FGND Y I EN 
LKQNDISTEFTYQTKDAATGTAS I I VNNEGQNI I VIVAGANLLL 
NTEDLRAAANVI S RAKVMVCQLE I T P ATS LEALTMARRS GVKTL 
rri f Air'i\j_ftUjjUi , Uf * A iioiJ VFLCNESEAE I LTGLT VGSAADAGE 

AALVLLKRGCQWIITLGAEGCWLSQTEPEPKHIPTEKVKAVD 
TTVSFKI 


5489 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK 
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK 
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFIJSANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
iNJjfis/\yje»i\jNisj3c.uc / 1J\1 a r»Tr,a IAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPRtiSLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE 
QGDMASSFLPAGAITGDSGGELSSGDDSGEVEFPHSPEIEETSC 
LAE LFE KAAAHLQGL I Q VASREQLL YLYAR YKQ VKVGNCNT P KP 
S FFDFEGKQKWEAWKALGDSSPSQAMQE Y IAWKKLDPGWNPQ I 
P EKKG KE ANTGFGGP VI S SLYHEET I RE EDKN I FD YCRENN I DH 
I TKAI KS KNVDVNVKDEEGRALLHWACDRGHKELVT VLLQHRAD 
I NCQDNEGQ TALHYASACEFLD I VELLLQS GADPTLRDQDGCL P 
E E VTGCKT VSLVLQRHTTGKA 


5492 


3 


1896 


ASKNPLS AVCTTG I MS SLAVRDPAMDRS LRSVFVGNI PYEATEE 
QLKDIFSEVGSWSFRLVYDRETGKPKGYGFCEYQDQETALSAM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAPI IDSPYGDP 
IDPEDAPESITRAVASLPPEQMFELMKQMKLCVQNSHQEARNML 
LQNPQLAYALLQAQ WMRIMDPEIAtiKILHRKIHVTPL I PGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNQQNPPAPQPQHLARRPVKDI 
PPLMQTPIQGGIPAPGPIPAAVPGAGPGSLTPGGAMQPQLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHEMRG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLE TR VMERRGMETCAMETRGMEARGMDARGLEMRG P VP S SRGP 
MTGGIQGPGPINIGAGGPPQGPRQVPGISGVGNPGAGMQGTGIQ 
GTGMQGAGIQGGGMQGAGIQGVSIQGGGIQGGGIQGASKQGGSQ 
PSSFSPGQSQVTPQDQEKAALIMQVLQLTADQIAMLPPEQRQSI 
L I LKEQ I QKS TGAS 


5493 


1 


1876 


RAPMMTKAVPEEPRKPGRLTQALNSPLTWEHVWICVPGGTPDCL 
TDTFR VKR PHLRRSASNGHVPGTP VYRE KEDM YDE 1 1 ELKKSLH 
VQ KSD VDLMRTKLRRLE EENSRKDRQ I EQLLD PS RGTD FVRTLA 
EKR PDASWV INGLKQR I LKLEQQ CKE KDGT I S KLQTDMKTTNLE 
EMRIAMETYYEEVHRLQTLLASSETTGKKPLGEKKTGAKRQKKM 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
S KPRLLRR I VELEKKLSVMESSKSHAAEPVRSHP PACLAS SSAL 
HRQ PRGDRNKDHERLRGAVRDLKEER TALQEQLLQRDLEVKQLL 
QAKADLEKELE CAREGE E ERREREE VLREE I QTLTS KLQE LQ EM 
KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 
RSPCSDGRRDAAARVLQAQWKVYKHKKKKAVLDEAAVVLQAAFR 




WO 01/53312 



PCT/US00/34263 



GHLTRTKLLASKAHGS EPPS VPGLPDQS SPVPRVPS P I AQATGS 
PVQEEAI VI IQSALRAHLARARHSATGKRTTTAASTRRRS ASAT 
HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71 


536 


RSKAKIGTPTREVPSTDMKVRRESSSSLTHRPAPSPATPRLLGT 
RRVLLG VS EGTGCADAM E LVLVFLCS LLAPMVLAS AAE KE KEMD 
PFHYDYQTLRIGGLVFAWLFSVGILLILSRRCKCSFNQKPRAP 
GDEEAQVENL I TANATEPQKAEN 


5495 


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG 
ELRPASLWLPRSLAPAFERFCQVNTGPLPLLGQSEPEKWMLPP 
QGAISETRMGHPQFWKYEFGACTGSLASIiEQYSEQLKDMVAFFL 
GCSFSLEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP 
LWTMRPI PKDKLEGLVRACCSLGGEQGQPVHMGDPELLG I KEL 
S KP AYGDAM VCP PGEVP VFWPS PLTS LGAVS S CE TPLAFAS I PG 
CTVMTDIiKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLL.CKDELLKASLSLSHARSVLITT 
G FPTH FNHE P PEE TDGPPGAVALVAFLQALEKE VAI I VDQRAWN 
LHQKI VEDAVEQGVLKTQ I P ILT YQGGS VEAAQAFLCKNGDPQT 
PRFDHLVAIERAGRAADGNYYNARKMNIKHIiVDP IDDLFLAAKK 
IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVSNWGGYALACALYILYSCAVHSQYLRKAVGPSRAPGDQA 
WTQALPS VIKEEKMLG ILVQHKVRSGVSG I VGMEVDGLP FHNTH 
AEM I QKIiVD VTTAQV 


5496 


3 


2408 


QDTKMHEIYKGNITPQLNKNTLKTSAATDVWAVYFSQFWIDYEG 
MKSGKGRP IS FVDS FPLS I W I CQPTR YAESQKE PQTCNQVSLNT 
SQSESSDLAGRLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLFLHESLILLSE 
NLRKDVEAVTGS PASQTS ICIGI LLRSAELALLLHP VDQANTLK 
SPVSESVSPWPDYLPTEWGDFLSSKRKQISRDINRIRSVTVNH 
MSDNRSMSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYRE 
DSN I LS FDS DGNQN I LS S TL TS KGNET I ES I FKAEDLLPEAAS Ij 
SENLDISKEETPPVRTLKSQSSLSGKPKERCPPNLAPLCVSYKN 
MKRS S S QMS LDT I S LDSM I LE EQLLE S DG SDSHM FLE KGNKKNS 
TTN YRGTAES VNAGANLQNYGETS PDAI S TNSEGAQENHDDLMS 
VWFK ITGVNGEI D IRGEDTE I CLQVNQVTPDQLGN I S LRH YLC 
NRPVGSDQKAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFLQ 
CHIKNFSTE FLTSS LMNIQHFXjEDETVATVMPMKIQVSNTKINL 
KDDS PRS S TVS LEP AP VTVHI DHL WERS DDGS FH I RDSHMLNT 
GNDLKENVKSDSVLLTSGKYDLKKQRSVTQATQTSPGVPWPSQS 
ANFPEFSFDFTREQLMEENESLKQEIiAKAKMALAEAHLEKDAIiL 
HHIKKMTVE 


5497 


1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRLFSLASQI IREQQSPNV 
CFIYKYSGFPSLECQCHFVSPHSSCYINFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQIPSWKDWAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
LALSRGLQLDTQRSSRDSLQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRNFQAKRPA 
S TAGLPTTLGPAMVTPG VATI RRTPSTK PS VRRGT I GAG P I P I K 
TPVIPVKTPTVPDLPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSMPSS MWSGQAS VNP PLPGPKPS 1 PEEHRQAI PES EAEDQER 
EPPSATVSPGQIPESDPADLSPRDTPQGEDMXiNAIRRGVKIjKKT 
TTNDRSAPRFS 


5498 


2434 


1492 


ILTHQEIFTGEKPCECGKASIQMSHLSQQKIYSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKP FECNECGKAFSQKQYV 1 KHQNTH 
TGEKLFE CNECGKS FSQKENLLTHQKIHTGE KPFECKDCGKAF I 
QKSNLI RHQRTHTGEKP F VCKE CGKT FSGKSNLTEHEKI HI GEK 
PFKCSECGTAFGQKKYLIKHQNIHTGEKPYECNECGKAFSQRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
NE CG KAFS QFS TliALHLRI HTG KKP YQCS E CGKAFSQKS HH I RH 
QKIHTH 


5499 


324 


926 


GFGQIGRGHK1TTYPFSPRKSGRKGMAQSQGWVKRYIKAFCKGF 
FVAVPVAVTFLDRVACVARVEOASMQPSLNPGGSQSSDWLLNH 
WKVRNFEVHRGDIVSLVSPKNPEQKIIKRVIALEGDIVRTIGHK 
NRYVKVPRGH IWVEGDHHGHSFDSNSFGPVSLGLLHAHATHILW 
PPERWQKLES VL P PER LP VQREEE 


5500 


1978 


1286 


KPDWRLQNLPPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP 
VKHE PVFIRLNPGDRGVMMD I SAPFFRDPPAPLGEPGKP FNELW 
DYEWEAFFL1TDITEQYLEVELCPHGQHLVLLLSGRRNVWKQEL 
PLS FRVS RGETKWEGKAYL PWS YF P PNVTKFNS FAI HGS KDKRS 
YEALYPVPQHBLQQGQKPDFHCLEYFKSFNFWTLIiGEEWKQPBS 
DLWLIEKCDI 


5501 


2927 


2226 


CRP P VS ARVAPGHQGAVGGS GRRP ARVE WDAAAR PSSRPFSLP 
AAIMLALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGQF S EDMI PTVGFNMRKVT KGNVT I K I WD I GGQ PRFRS MWER Y 
CRG VNAI VYM I DAADREKI EAS RNELHNLLDKPQLQGI P VLVIjG 
NKRDLPNAIJDEKQI.I EKMNLSAIQDRE I CCYS I S CKEKDNI DI T 
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFPVWVPERTALLTCPLGAAPGSSREAPGIAGPPNSTAMSKL 
GKF F KGGGS S KS RAAPS PQ EAL VRLRE TEEMLGKKQ E YLENR I Q 
RE I ALAKKHGTQNKRAALQALKRKKRFEKQLTQ I DGTLSTI E FQ 
REALENSHTNTEVLRNMGFAAKAMKS VHENMDLNKI DDLMQE I T 
EQQDIAQEISEAFSQRVGFGDDFDEDELMAELEELEQEELNKKM 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 


216 


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKEVDQAIKSTA 
EKVLVLRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 
AVYTQYFDI S Y I PSTVFFFNGQHMKVDYGGEDPALRS IKAVRRT 
SPAGTLGEKPVNS 


5504 


58 


3563 


QLS FS FQAP VTFDD I T VYLLQE E WVLLS QQQKELCGSNKL VAP L 
GPTVANPELFRKFGRGPEPWIXJSVQGQRSLLEHHPGKKQMGYMG 
EME VQG PTRE SGQS LP PQKKAYLS HLSTGSGH I EGD W AGRNRKL 
LKPRS IQKS WFVQF PWL IMNEEQTALFCSACRE YPS I RDKRSRL 
I EG YTG P FKVETLKYHAKS KAHMFCVKALAARD P I WAARFRS I R 
DPPGDVLAS PEPLFTADCP IFYPPGPLGGFDSMAELLPSSRAEL 
EDPGGDGAIPAMYLDCISDLRQKEITDGIHSSSDINILYNDAVE 
SCIQDPSAEGLSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 
QKEL YRDVMRMNYE LLAS LG PAAAKPDL I S KLERRAAPWI KDPN 
GP KWGKGRPPGNKKMVAVREADTQASAADSALLPGS PVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 
FCS ACI ERPJTLHD KS S RL VRG YTG PFKVETL KYHE VS KAHRL CV 
NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 
DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 
VRNS PC VS VLLDS S TDAS EQACVG I YIR YFKQMEVKES YITLAP 
L YS ETADG YFET I VS ALDE LD I P FRKPG WWGLG TDGS AMLS CR 
GGLVEKFQEVI PQLLP VHCVAHRLHLAWDACGS I DLVKKCDRH 
IRTVFKFYQSSNKRLNELQEGAAPLEQEIIRLKDLNAVRWVASR 
RRTLHALL VS W PALARHLQR VAE AGGQ I GHRAKGML KLMRGFHF 
VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 
LRHQAGPKEEEFNASFKDGRLHGICLDKLEVAEQRFQADRERTV 
LTG I E YLQQRFDADRP P QLKNME VFDTMAW P SG I ELAS FGNDD I 
LNLARYFECSLPTGYSEEALLEEWLGLKTIAQHLPFSMLCKNAL 
AQHCRFPLLSKLMAVVVCVPI STSCCERGFKAMNRIRTDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
CAQ VPARS PAS ARLR KEEMGAL YVEE PRTQ KP P I LPS REAAEVL 
KDC IMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCSPRSLSAAKMSNRNNNKLPSNLPQIiQNLIKRDPPAYIEEFLQ 
QYNHYKSNVEIFKLQPNKPSKELAELVMFMAQISHCYPEYLSNF 
PQEVKDLLSCNHTVLDPDLRMTFCKALILLRNKNLINPSSLLEL 
FFELFRCHDKLLRKTLYTHIVTDIKNINAKHKNNKVNWLQNFM 
YTMLRDSNATAAKMSLDVMI ELYRRNI WNDAKTVNVI TTACFS K 
VTKILVAALTFFIX3KDEDEKQDSDSESEDDGPTARDLLVQYATG 
KKSSKNKKKLEKAMKVLKKHRKKKKPEVFNFSAIHLIHDPQDFA 
EKLLKQLECCKERFEVKMMLMNLISRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKILLFAAQASHHIiVPPE I IQSLLMTVANNFVTDK 
NSGEVMTVGINAIKBITARCPLAMTEELLQDLAQYKTHKDKNVM 
MSARTLIHLFRTLNPQMLQKKFRGKPTEAS I EARVQE YGE LDAK 
DYI PGAEVLEVEKEENAENDEDGWES TSLSEEEDADGEWI DVQH 
SSDEEQQEISKKLNSMPMEERKAKAAAISTSRVLTQEDFQKIRM 
AQMRKELDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERLHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 


1 


1531 


FRGDLCGQRGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWSPLEGGGDEELRPNPYVRFPYRWWAVW 
LAAFPS LGAGGETPEAP PES WTQLWFFRFWNAAGYAS FMVPGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPLAPRT 
EAAE TT PMWQALKLL FCATGLQ VS YLTWG VLQERVMTRS YGATA 
TS PGERFTDSQ FLVLMNRVLAL I VAGLSCVLCKQPRHGAPMYRY 
S FAS L S NVLS S W CQYEALKFVS FP TQVLAKAS KVI P VMLMGKLV 
SRRSYEHWEYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAG YI AFDS FTSN WQDALFA YKMSS VQMMFG VNFFS CLFTVGSL 
LEQGALLEGTRFMGRHSEFAAHALLLSICSACGQLFIFYTIGQF 
GAAVFT I IMTLRQAFAI LLS CLLYGHTVTWGGLGVAWFAALL 
LR V YARGRLKQRG KKAVP VE S PVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLE IGGFGTAAGKK 
VAVAD VQ FX3 PMR FHQDQLQ VLLVFTKEDNQCNG FCRACE KAG FK 
CTVTKEAQAVLACFLDKHHD II I IDHRNPRQLDAEALCRS IRSS 
KLS ENTVI VG WRRVDREELSVMP F I SAGFTRR YVENPN IMACY 
NELLQLEFGEVRSQLKLRACNSVFTALENSEDAIEITSEDRFIQ 
YANPAFETTMGYQSGELIGKELGEVPINEKKADLLDTINSCIRI 
GKEWQGI YYAJCKKNGDNIQQNVKI IPVIGQGGKIRHYVS I IRVC 
NGNNKAE KIS ECVQS DTHTDNQTGKH KDRRKG S LDVKAVASRAT 
EVSSQRRHSSMARIHSMTIEAPITKVINIINAAQESSPMPVTEA 
LDR VLE ILRTTEL YS PQFGAKDDDPHANDLVGGLMS DGLRRLSG 
NEYVLSTKNTQMVS SN I ITP I S LDDVPPRIARAMENEEYWDFDI 
FELEAATHNRPLIYLGLKMFARFGICEFLHCSESTLRSWLQIIE 
ANYHS SNP YHNS THS ADVLHATAYFLS KER I KETLDP IDE VAAL 
I AATIHDVDHPGRTNSFLCNAGSELAI L YNDTAVLE SHHAALAF 
QLTTGDDKCNI FKNMERND YRTLRQG 1 1 DMVLATEMTKHFEHVN 
KFVNS INKPLATLEENGETDKNQEVINTMLRTPENRTLI KRMLI 
KCADVSNPCRPLQYCIEWAARISEEYFSQTDEBKQQGLPWMPV 
FDRNTCS I PKSQ IS FIDYFI TDMFDAWDAFVDLPDLMQHLDNNF 
KYWKGLDEMKLRNLRPPPE 


5508 


1151 


691 


LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGLRGFPN 
VLKKVLVDOLVASPLI/JWYFLGLGCLEGQTVGESCQELREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRSPVPLTPPGCVALDTRAD 


5509 


1238 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKLLKQVDFLNWE 
VTDHNLHE LRVLRR YRLQRREDYTRYNQLS RAVRELARRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDWTDPAFLVTRSM 
EDFVTWVDbbKIKRHVIjEYNEERDDFDIiEA ! 


5510 


96 


1195 


PAGAHLS S GS SE PL VE PGRGR VGARVKGERGLQASGS APGRS KM 
AEGERQP P PDSSEE AP PATQNF 1 1 PKKE I HTVPDMGKWKRSQAY 
ADYI G F I LTLNEG VKG KKLT FE YRVS E AI E KLVALLNTLDRW I D 
ETPPVDQPSRFGNKAYRTWYAKLDEEAENLVATWPTHLAAAVP 
EVAVYLKESVGNSTRIDYGTGHEAAFAAFLCCLCKIGVLRVDDQ 
IAI VFKVFNRYLEVMRKLQKT YRMEPAGSQGVWGLDDFQFLP F I 
WGSSQL I DHPYLE PRHFVDE KAVNENHKDYMFLE CI LF I TEMKT 
GPFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 



5511 


276 


1980 


KLSRVLNLPPENL I TS ISAVP I SQKEEVADFQLS VDS LLEKDND 
HS RPD I Q VQAKRLAE KLRCDTWSE I S TG QRTVNFKI NRE LLTK 
TVLQQVI EDGSKYGLKSELFSGLPQKKI WEFSS PNVAKKFHVG 
HLRSTIIGNFIANLKEALGHQVIRINYLGDWGMQFGLLGTGFQL 
FGYEEKLQSNPLQHLFEVYVQVNKEAADDKSVAKAAQEFFQRLE 
LGDVQAL S LWQKFRD LS I EE Y I RVYKRLG VYFDE YSG ES F YRE K 
SQEVLKLLES KGLLLKTIKGTAWDLSGNGDPSS ICTVMRSDGT 
SLYATRDLAAAIDRMDKYNFDTMIYVTDKGQKKHFQQVFQMLKI 
MG YDWAERCQHVP FGWQGM KTRRGDVTFLEDVLNE I QLRMLQN 
MAS I KTTKE L KNPQETAER VG LAAL 1 1 QD FKG LLLS D YKFS WDR 
VFQS RGDTGVFLQYTHARLHS LEETFGCGYLNDFNTACLQE PQS 
VS I LQHLLR FDE VLYKSSQD FQ PRHI VS YLLTLSHLAA VAHKTL 
QIKDSPPEVAGARLHLFKAVRSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSLLLTITVTGVTVLVLVLKSMNSRRREPITXiQDPEAKYPLPIi 
IEKEKI SHNTRRFRFGLPS PDHVLGLP VGNYVQLLAKI DNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQYIiENMK 
IGETIFFRGPRGRLFYHGPGNLGIRPDQTSEPKKTLADHLGMIA 
GGTG ITPMLQIjI RH ITKD PSDRTRMSLI FANQTEEDI LVRKELE 
EIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLILVCGPPPLIQTAAHPNLEKLGYTQDMI FTY 


5513 


2 


837 


ARWRLPSDSPRIPPAGAETPGRGSCRNYLPSSSPPPPEPSSFPS 
PPTSRGGPGSRDTMSDSEEESQDRQLKIWLGDGASGKTSLTTC 
FAQ ETFGKQ YKQTTGLDFFLRR ITLPGNLNVTLQI WD I GGQT IG 
GKMLDKYIYGAQGVLLVYDITNYQSFSNLEDWYTWKKVSEESE 
TQ PLVALVGNKI DLEHMRTI KPEKHLRFCQENGFSSHF VSAKTG 
DSVFLCFQKVAAEILGIIOjNKAEIEQSQRVVKADIVNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VNRPSWIMGNFRGHALPGTFFFI IGLWWCTKS ILKYICKKQKRT 
CYLGSKTIiFYRLEILEGITIVGMALTGMAGEQFIPGGPHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTI S SLPVS LTKL 
MLSNALFVEAFI FYNHTHGREMLDI FVHQLLVLWFLTGLVAFL 
EFLVRNNVLLELLRSSLILLQGSWFFQIGFVLYPPSGGPAWDLM 
DHENILFLTICFCWHYAVTIVIVGMKYAFITWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 




5515 


260 i 


FVRLVGRGDCDPLLSVCLTTMPLYEGLdsdGEKTAWIDLGEAF 
TKCGFAGETGPRCIIPSVIKRAGMPKPVRWQYNINTEELYSYL 
KE F I H I L YFRHLLVNPRDRRWI IE S VLCPSHFRETLTRVLFKY 
FEVPSVLLAPSHLMALLTLGINSAMVIiDCGYRBSLVLPI YEG I P 
VLNCWGALPLGGKALHKELETQLLEQCTVDTSVAKEQSLPSVMG 
SVPEGVLEDIKARTCFVSDLKRGLKIQAAKFNIDGNNERPSPPP 
NVDYPLDGEKILHILGSIRDSWEILFEQDNEEQSVATLILDSL 
IQCPIDTRKQLAENLWIGGTSMLPGFLHRLLAEIRYLVEKPKY 
KKALGT KTFR I HTP PAKANCVAWLGGAI FGALQD I LGS RS VS KE 
YYNQTGRI PDWCSLNNPPLEMMFDVGKTQP PLMKRAFSTEK 


5516 


3 


735 


NSREPPQAGPGPSPRKSPTASSFLFPWRPXiASSFWMGAQGAQES 
I K7^WRVPGTTRRPVTGESPGMHRPEAMIjIiLLTIiALLGGPTWAG 
KMYGPGGGKYFSTTEDYDHEITOLRVSVGLLtiVKSVQVKLGDSW 
D VKLGALGGNTQE VTLQPGE Y I TKVF VAFQAFLRGMVM YTS KDR 
YF YFGKLDGQIS SAYPSQEGQVLVGI YGQYQLLG I KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR 


5517 


246 


499 


SEIYVAMRTDSSKMTnvP^n\TJvTJ'P&<3C!ap&f3PRKiaT onTnocus 
TDGTSDLPLKLEALSVKEBAKEKDEKTTQDQLEKPQNEEK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPOjWLGLLLPLVAALDFNYHRQEGMEA 
FLKTVAQNYSSVTHLHSIGKSVKGRNLWVLWGRFPKEHRIGI P 
EFKYVANMHGDETVGRELLLHLIDYLVTSDGKDPEITNLINSTR 
IHIMPSMNPDGFEAVKKPDCYYSIGRENYNQYDLNRNFPDAFEY 
mVSRQPETVAWKWLKTETFVLSANLKGGALVAS YPFDNGVQA 
TG ALYSR SLT PDDDVFQ YLAHT YAS RNPNMKKGDE CKNKMNFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S FWNNNKAS LIE YI KQVHLGVKGQVFDQNGNPLPNVI VE VQDRK 
H I CP YRTNK YGEYYLL LLPG S Y 1 1 NVTVPGHD PH I TKV 1 1 PE KS 
QNFSALKKDILLPFQGQLDSIPVSNPSCPMIPLYRNLPDHSAAT 
KPSLFLFLVSLLHIFFK 


5519 


87 


477 


x rwi> AXiiNyy v Hi v y lib WKb J. h, AKbr i MLjiUsbuWDboKAAVAAVVG 
GVYAVGTVLVALSAMGFTS VG IAASS IAAKMMSTAAI ANGGGVA 
AGSLVAILQS VGAAGLS VTSKVIGGFAGTALGAWLGS PPS S 


5520 


117 




Jr l£iV5KUi\vJjK.lr 1 VfKdAljAMl KUbTCXYnr IjVIjSWYTFIjNYYI 
S QEGKDEVKPKILANGARWKYMTLLNLLLQTI F YGVTCLDD VLK 
RTKGGKDIKFLTAFRDLLFTTIiAFPVSTFVFLAFWILFLYNRDL 
IYPKVLDTVIPVWIiNHAMHTFIFPITLAEVVLRPHSYPSKKTCL 
TLLAAASIAYISRILWLYFETGTWVYPVFAKLSLIiGLAAFFSLS 

VlfPTSQIVT T flPVT MUMVtJW PtfATT rtBUnT Cicero Tr^w^NMT»T^r.»r^<-i 

i Vr iAol iljJj^liKLiWnWKWvSVQIljQK 
PAKHQLVKNIR 


5521 


546 


911 


KILNMQKSCEENEGKPQNMPKAEEDRPLEDVPQEAEGNPQPSEE 
GVSQEAEGNPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREE IRRVRNKFVMMHWKQRHSRSRP YPVCFRP 


5522 


1224 


637 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQGSTDHRGVPGKPGRWTLVEDPAGCVWGVAYRLPVGKEEEVK 
AYLDFREKGG YRTTTVI FYPKDPTTKPFS VLL YIGTCDNPD YLG 
PAPLEDIAEQIFNAAGPSGRNTEYLFELANSIRNLVPEEADEHL 
FALEKLVKERLEGKQNLNCI 




5523 


3 


1280 


S KGKKRMG S S MS AATARR P VFDD KED VNFDHFQ I LRA I GKG S FG 
KVCI VQKRDTEKMYAMKYMNKQQC I ERDEVRNVFRELE I LQE IE 
HVFLVNLWYSFQDEEDMFMWDLLLGGDLRYHLQQNVQFSEDTV 
RL Y I CEMALALD YLRGQH 1 1 HRDVKPDN I LLD ERGHAHLTD FNI 
ATI IKDGERATALSGTKPYMAPEI FHS FVNGGTGYSFEVDWWSV 
GVMAYELLRGWRPYDIHSSNAVESLVQliFSTVSVQYVPTWSKEM 
VALLRKLLTVWPEHRLSSLQDVQAAPALAGVLWDHLSEKRVEPG 
FVPNKGPJLHCDPTFELEEMILESRPLHKFCKKRLAKNKSRDNSRD 
SSQSENDYLQDCLDAIQQDFVIFNREKLKRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524 


85 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMELRCGGLLFSSRFDSG 
KLAH VEKVES LS S DGEG VGGGASAIiTSGI AS S PD YE FNVWTR PD 
CAE TE FENGNRS W F YFS VRGGMPG KIi I KI NIMNMNKQS KLYSQG 
MAPFVRTLPTRPRWER I RDRPTFEMTETQFVLS FVHRFVEGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
BLLCYSIiDGLRVDLLTXTSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKRI FFLSSRVHPGETPSS FVFNGFLDFI LRPDDPRAQTIiR 
RLFVFKL I PM LN PDG WRGHYRTDSRGVNLNRQ YLKPDAVLHPA 
IYGAKAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDLEKAN 
NLQNE AQ CGHS ADRHNAEAWKQTE PAE Q KLNS VW I MPQQS AGLE 
ESAPDTI PP KE S GVAY YVDLHGHASKRG C FMYGNS FS DES TQVE 
NMLYPKLISLNSAHFDFQGCNFSEKNMYARDRRDGQSKEGSGRV 
AI YKAS G X IHS YTLE CN YNTGRS VNS I PAACHDNGRAS PP P PPA 
FPSRYTVEL FEQVGRAMAI AALDMAECNPWPRI VLSEHSS LTKL 
RAWMLKHVRNSRGLSSTLNVG VNKKRGLRTPPKSHNGLP VSCS E 
NTLS RARS FS TGTS AGGSS S S QQNS P QM KNS PS FP FHGSR P AGL 
PGLG S S TQKVTHR VLG P VRGKP VWEP LQHVFGCLGHC WGK 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTL\/RESdS' ' ' 
IiTYE E FLGR VAE LNDVTAKVASGQEKHLL FE VQ PGS DS S AFWKV 
WRWCTKINKS SGI VEASR I MNLVOFTOLYKDITSOAARVT . An 
SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEBCCICMDGRAD 
LILPCAHSFCQKCIDKWSDRHRNCPI CRLQMTGANESWWSDAP 
TEDDMANYILNMADEAGQPHRP 


5526 


3 


853 


RR PCNPVRAAKRTGAAARAPRGLE VTMLR VAWRTLS L I RTRAVT 
QVLVPGLPGGGSAKFP FNQWGLQ PRSLLLQAARGYWRKPAQS R 
LDODP P PS TLLKD YQNVPG I E KVDDWKRLLSLEMANKKEMLKI 
KQEQ FMKKI VAN P EDTRSLEAR I IALSVKIRSYEEHLEKHRKDK 
AHKRYLLMS IDQRKKMLKNLRNTNYDVFEKI CWGLG I EYTFP PL 
YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK 
RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


LLR K YLLHQNPLLLRHQPNRTC I S FS ATMKLKDTKSR PKQS S CG 
KFQTKGIKWGKWKEVKIDPNMPADGQMDDLVCFEELTDYQLVS 
PAKNPSSLFSKEAPKRKAQAVSEEEEEEEGKSSSPKKKIKLKKS 
KNVATEG TS TQ KEFEVKD PELEAQGDDMVCDDP EAGEMTS ENL V 
QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWI PEVHDQKADVS 
AWKDLFVPRPVLRALS FLGFSAPTP IQALTLAPAIRDKLDILGA 
AETGSGKTLAFAIPMIHAVLQWQKRNAAPPPSNTEAPPGETRTE 
AGAETRSPGKAEAESDALPDDTVIESEALPSDIAAEARAKTGGT 
VSDQALLFGDDDAGEGPS S L I REKPVPKQNENEEENLDKEQTGN 
LKQELDDKS ATCKAY P KR P LLGLVLTPTRELAVQ VKQH I DAVAR 
FTGIKTAI LVGGMSTQKQQRMLNRRPEI WATPGRLWELI KEKH 
YHLRNLRQLRCLVVDEADRMVEKGHFAELSQLLEMLNDSQYNPK 
RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKPKVIDLTRNEATVETLTETKIHCETDEKDFYLYYFLMQYPG 
RSLVFANS I S CI KRLSGLLKVLD I MP LTLHACMHQKQRLRNLEQ 
FARLEDCVLLATDVAARGLDIPKVQHVIHYQVPRTSE I YVHRSG 
RTARATNEGL S LML I G P EDVIN FKKI YKTLKXDED I PLFPVQTK 
YMDWKERI RLARQ I E KS E YRNFQACLHNS W I EQAAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQSG KP PLLVS AP S KSES ALS CLS KQKKKKTKKPKE PQPEQP 
QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLATISITLRRYLRLGATMAKSKFE 
YVRDFEADDTCLAHCW WVRLDGRNFHRFAE KHNFAKPNDS RAL 
QLMTKCAQTVMEELEDIVIAYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRWVYPSNQT 
LKD YLS WRQADCHINNLYNTVFWAL TQQSGLTP VQAQGRLQGTL 
AADKNE IL FS EFNIN YNNE PPM YR KGTVL I WQKVDEVMTKE I KL 
PTE MEG KKMAVTRTRT KP CKPSHLPRAPCLRWL 


5529 


48 


640 


TFRLVSAHLKTRKLINPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAAMENPNLRE I VE QCVLE PD 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDLKPENVVFFEKQGLVKLTDFGFSNK 
FQPGKKLTTSCGSLAYSAPEILLGDEYDAPAVDIWSLGVILFML 
VCGQPPFQEANDSETLTMIMDCKYTVPSHVSKECKDLITRMLQR 
DP KRRASLEE I ENHP WLQG VDPS P AT KYNI P L VS Y KNLS E E EHN 
S 1 1 QRMVLGD I ADRJDA I VE ALETNR YNH ITAT Y FLLAER I LREK 
QEKEIQTRSASPSNIKAQFRQSWPTKIDVPQDLEDDLTATPLSH 
ATVPQS P ARAADS VLNGHRS KGLCDS AKKDD L P E LAG P ALS TVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVLNQIFEEGESDDEFDMDENLPPKLSRLKMNI 
no c\j 1 v nKK i nitK K-bUGKGS S Co S S ET5DDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESLK 
LMSLCLGSQLHGSTKYIIDPQNGLSFSSVKVQEKSTWKMCISST 
GNAGQVPAVGGIKFFSDHMADTTTELERIKSKNLKNNVLQLPLC 
EKTISVNIQRNPKEGLLCASSPASCCHVI 


5531 


24 


515 


GSQPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGTVLFARLF 
ALEP DLL PLFQ YNCRQFSS PED CLS S P E FLDH 1 RKVMLV I DAAV 
TNVEDLSSI^EYLASU3RKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE 


5532 


3395 


1402 


SDWMWGKRKM 1 1 EDETEFCGEELLHS VLQCKS VFDVLDGE EMR 
RARTRANP YEM 1 RG V FFLNRAAMKMANMDFVFDRM FTNPRD S YG 
KPL VKDREAELL YFADVCAG PGGFS E YVLWRKKWHAKGFGMTL K 
GPNDF KLEDFY S AS S ELFEP Y YGEGG IDGDGD I TRP ENI S AFRN 
FVLDNTDRKGVHFLMADGGFSVEGQENLQEILSKQLLLCQFLMA 
LS IVRTGGHFICKTFDLFTP FS VGLVYLL YCCFBRVCLFKP ITS 
RPANSERYWCKGLKVGIDDVRDYLFAVNIKLNQLRNTDSDVNL 
WPLEVIKGDHEFTDYMIRSNESHCSLQIKALAKIHAFVQDTTL 
SEPRQAE IRKECLRLWG I PDQARVAPS SSDPKS KFFEL IQGTEI 
DIFSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRW I KLDLKTELPRDTLLS VE IVHELKGEGKAQRKI 
SAIHI IiDVLVLNGTDVREQHFNQRIQIiAEKFVKAVSKPS RPDMN 
PIRVKE VYRLEEMEKI FVRLEMKI I KGSSGTPKLS YTGRDDRHF 
VPMGL Y I VRTVNE PWTMGFSKSFKKKFFYNKKTKDSTFDLPADS 
IAPFHICYYGRLPWEWGDGIRVHDSQKPQDQDKLSKEDVLSFIQ 
MHRA 


5533 


94 


789 


MKERRAPQ P WARC KLVLVGDVQCG KTAMLQVIiAKDCY P ET YVP 
TVFEN YTACL ETEEQRVELS L WDTSG S P YYDNVR P LC YS DS D AV 
LLCFD I S RPETVD S ALKKWRTE I LD YC PSTRVLL I GCKTD LRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
HSIFRTASMLCLNKPSPLPQKSPVRSLSKRLLHLPSRSELISPT 
FKKEKAKXCS I M 


5534 


3 


605 


L VRGRARAANPGR VGAMDGL RQR VEH FL EQRNL VTE VLGAL EAK 
TGVEKRYLAAGAVTLLS LYLLFGYGASLLCNLIGFVYPAYAS I K 
AI ES PS KDDDTVWLTYWWYALFGLAEF FSDLLLS WFP FY YVGK 
CAFLLFCMAPRPWNGALMLYQRWRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDSEARLCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSERILTEAKQKMRELTVNI KMKEDL I KELI KTGNDAK 
SVSKQYTLKVTKLEHDAEQAK7ELTETQKQLQELENKDLSDVAM 
KVKLQKE FRKKVDAAKLRVQVLQKKQQDS KKLAS LSI ONE KRAN 
E LEQS VDHMKYQ K I QLQRKLQE ENEKRKQLDAV I KRDQQKI KVI 
LSYI PAJCYNMKC 


5536 


942 


282 


AAATAASLSPRGCRLRTPSSDVSPSRAPPPSAAPLPTGRAQMSP 
S GRLCLLT I VGL I L P TRGQTLKDTTS S S SADAT I MD I QVPTRAP 
DAVYTE LQ PTS PT P TWPADET P QP QTQTQQLBGTDG PLVTDPET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGFHEDDPF 
FYDEHTLRKRGLL VAAVLF I TG 1 1 1 LTS GKCRQLS RLCRNHCR 


5537 


3 


2391 


RARVS S PQLRVFRSGRPRRLRVLR INRTS VALRIiAGTGRFVAKT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLLTEHCTEASFQKVISRRHGSCDLENLHLRKRWKREECEG 
HNGCYDEKTFKYDQFDESS VESIiFHQQ I XjSSCAKS YNFDQ YRKV 
FTHS S LLNQQEE I D I WGKHH I YDKTS VLFRQVS TLNS YRNVF I G 
E KNYHCNNS EKT LNQS S SPKNHQENYFLEKQ YKCKE F EEVFLQ S 
MHGQBKQEQSYKCNKCVEVCTQSLKHIQHQTIHIRENSYSYNKY 
D KDLSQS SNLR KQ I IHNEEKPYKCEKCX3DSLNHSLHLTQHQI IP 
TEEKP YKWKEQ3KVFNLNCSL YLTKQQQ IDTGENLYKCKACS KS 
FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 
EKPYKCKECGKAFNRSSCLTQHQITHTGEKLYKCKVCSKSYARS 
SNLIMHQRVHTGEKP YKCKECGKVFSRS S CLTQHRKIHTGBNLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 
CGKAF SYS SDV IQHRR I HTGQR P YKC EE CG KAFN YRS YLTTHQR 
SHTGERPYKCEECGKAF^SRSYLTTHRRRHTGERPYKCDECX5KA 
FSYRSYLTTHRRSHSGERPYKCEECX3KAFNSRSYLIAHQRSHTR 
EKL 


5538 


926 


161 


n o M MM K. 1 r W U a ± 1? V LM huh hhG L I D I S QAQbSCTGPPAIPGIPG 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDQTIRFDHVI TNMNNNYEPRSGKFTCKVPGLYYFTYHA 
SSRGNLCWLMRGRERAQKVVTFCDYAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKNSLLGMEGANS I FSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG 
IVDGPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREK 
DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS 
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 
ELE KVHDL CDNFCHR YI TCLKGKMP I DL VI EDRDGGCRED FED Y 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTS VAS PSSGGEDEDLDQERRRNKKRG I FPKVATNIM 
RAWLFQHLSHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMI DQSNRTGQGAAFS PEGQP IGGYTETQPHVAVRPPGS VGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5542 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL


5543


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP
KRPFSDSGAFWSPERRPGVLEAPRRRPVPASFRAVPPKPTRVHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
RESRARRGPRGPSAFIPVEEVLREGAESLEQHLGLEALMSSGRV 
DNLAWMGLHPDYFTSFWRLHYLLLHTDGPLASSWRHYIAIMAA
ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSEINK
LLAHRPWLITKEHIQALLKTGEHTWSLAELIQALVLLTHCHSLS
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLLRDEGTSQEEMESRFELEKSESLL 
VTPSADILEPSPHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTIAMHSGV 
DTSVLRRAIWNYIHCVFGIRYDDYDYGEVNQLLERNLKVYIKTV
ACYPEKTTRRMYNLFWRHFRHSEKVHVNLLLLEARMQAALLYAL
RAITRYMT 


5544 


1895 


514 


LGGLLGRQRLLLRMGAGRLGAPMERHGRASATSVSSAGEQAAGD
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR
QRVESLRKKRPLFPWFGLDIGGTLVKLVYFEPKDITAEEBEEEV
ESLKSIRKYLTSNVAYGSTGIRDVHLELKDLTLCGRKGNLHFIR 
FPTHDMPAFIQMGRDKNFSSLHrVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRDIYGGDYBRFG 
LPGWAVASSFGNMMSKEKREAVSKEDLARATLITITNNIGSIAR
MCALNENINQVVFVGNFLRINTIAMRLLAYALDYWSKGQLKALF
SEHEGYFGAVGALLELLKIP


5545 


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVTCGSVL
KLLNTHHRVRLHSHDIKYGSGSGQQSVTGVEASDUANSYWRIRG 
GSEGGCPRGSPVRCGQAVRLTHVLTGKNLHTHHFPSPLSNNQEV
SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSVFLSVT 
GEQYGSPIRGQHEVHGMPSANTHNTWKAMEGIFIKPSVEPSAGH
DEL 


5546 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDVVS
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA


5547 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDVVS
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA


5548 


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA
DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFELANKEEN
REKNRYPNILPNDHSRVILSQLDGIPCSDYINASYIDGYKEXNK 
FIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY
WPDQGCWTYGNIRVCVEDCVVLVDYTIRKFCIQPQLPDGCKAPR 
LVSQLHFTSWPDFGVPFTPIGMLKFLKKVKTLNPVHAGPIWHC 
SAGVGRTGTFIVIDAMMAMMHAEQKVDVFEFVSRIRNQRPQMVQ
TDMQYTFIYQALLEYYLYGDTELDVSSLEKHLQTMHGTTTHFDK
IGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR
VILSMKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQEREQDKCYQYWPTEGSVTHGEITIEI
KNDTLSEAISIRDFLVTLNQPQARQEEQVRWRQFHFHGWPEIG
IPAEGKGMIDLIAAVQKQQQQTGNHPITVHCSAGAGRTGTFIAL
SNILERVKAEGLLDVFQAVKSLRLQRPHMVQTLEQYEFCYKWQ
DFIDIFSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ
CLARGSAGIKGLGRVFRIMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTIDFNEFLLTLRPPMSRARKEVIMQAF
RKLDKTGDGVITIEDLREVYNAKHHPKYQNGEWSEEQVFRKFLD
NFDSPYDKDGLVTPEEFMNYYAGVSASIDTDVYFIIMMRTAWKL


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI
TVAMKCQYVGADVLDLAETMVASADGLVYEPTVFDLSPQQKEWQ
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYVNKVAGNFHITVGKAIPHPRGHAHLAALVNHESYN 
FSHRIDHLSFGELVPAIINPLDGTEKIAIDHNQMFQYFITWPT
KLHTYKISADTHQFSVTERERIINHAAGSHGVSGIFMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGGIFSTTGMLHGIGKFIVEIIC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE
WFVFRRYAEFDKLYNTLKKQFPAMALKIPAKRIFGDNFDPDFIK
QRRAGLNEFIQNLVRYPELYNHPDVRAFLQMDSPKHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LIAKRKLIX3KFYA\nanjQKKIVLNRKEQKHIMAERlJWiLKNVKH 
PFLVGLHYSFQTTEKLYFVLDFVNGGELFFHLQRERSFPEHRAR
FYAAEIASALGYLHSIKIVYRDLKPENILLDSVGHWLTDFGLC
KEGIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 
EMLYGLPPFYCRDVAEMYDNILHKPLSLRPGVSLTAWSILSELL 
EKDRQNRLGAKEDFLEIQNHPFFESLSWADLVQKKIPPPFNPNV 
AGPDDIRNFDTAPTEETVPYSVCVSSDYSIVNASVLEADDAFVG 
FSYAPPSEDLFL 


5552 


2748 


930 


LGPAAGAAMGKKHKKHKAEWRSSYEDYADKPLEKPLKLVLKVGG 
S E VTELSGSGHDS S YYDDRS DHERERH KE KKKKKKKKSE KE KHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQPAENESTPIQQLLEHFLRQLQRKDPHGFFAFPVTDA 
IAPGYSMIIKHPMDFGTMKDKIVANEYKSVTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKILHAGFKMMSKQAALLGNEDTAVEEPVP 
EWPVQVETAKKSKKPSREVISCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDR INRFLPGGKMG YLKRNGDGSLLYS WNTAEP 
DADEESTHPVDLSSLSSKLLPGFTTLGFKDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDETGVQCALSLQEFVKDA 
GS YS KKWDDLJjDQ I TGG DHS R TL FQL KQRRNVPM KP PD EAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSR P SSNLSS LSNAS E RDQHHL 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 


5553 


74 


1095 


LGREAVYLVSRMDGPVAEHAKQEPFHWTPLLESWALSQVAGMP
VFLKCENVQ PSGSFKIRGI GHFCQEMAKKG CRHL VCS SGGNAG I 
AAAYAARKLG I PATI VLPESTS LQ WQRLQGEGAEVQLTGKVWD 
EANLRAQELAKRDGWENVP PFDHPLI WKGHAS LVQELKAVLRT P 
PGAL VIiAVGGGG LLAG WAGLLE VG WQH VP 1 1 AMETHGAH CFNA 
A I TAG KLVTLPD I TS VAKS LGAKT VAARALE CMQ VC KI HS EWE 
DTEAVS AVQQ LLD DE RMLVE PACG AALAAI YS GLLRRLQASGCL 
PPS LTS VWI VCGGNNI NSRELQALKTHLGQV 


5554 


166 


2318 


CSGRTGGRGSLRPAENVCLTCKtiSGAETRGLLCPALRTWIMKVL 
GRS F FWVL F P VLP WAVQ AVE HEEVAQR VI KLHRGRGVAAMQS RQ 
WVRDSCRKLSGLLRQKNAVLNKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFE I FQKELNESENSVFQAVYGLQRALQGD YKDWNMKESSR 
QRLEALREAAI KEETEYMELLAAEKHQVEALKNMQHQNQSLSML 
DEILEDVRKAADRLEEEIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANSKQNTITKREVEDDLGLSMLIDSQNNQYILTKPRDSTIPRADH 
HFIKDIVTIGMLSLPCGWLCTAIGLPTMFGYIICGVLLGPSGLN 
3 1 KS I VQ VETLGE FG VFFTL FL VG LE FS PEKLRKVWKISLQG P C 
YMTLLM I AFGLLWGHLLR I KPTQS VF I S TCLSLS S TPLVSRFLM 
GSARGDKEGDID YS TVLLGMLVTQDVQLGLFMAVM P TL IQAGAS 
ASSSIWEVLRILVLIGQILFSLAAVFLLCLVIKKYLIGPYYRK 
LHMES KGNKE IL I LG I SAF I FLMLTVTELLDVS MELGCFLAGAL 
VSSQG PWTEE I ATS IEP I RDFLAI VFFASIGLHVFPTFVAYEL 
TVLVFLTLSVWMKFLLAALVLSLILPRSSQYIKWIVSAGLAQV 
SEFSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 
TRCVPRPERRSSL 


5555 




1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 

KAYRKLALQLHPDRNPDDPQAQEKFQDLGAAYEVLSDSEKRKQY 
DT YGE EGL KDGHQS S HGDI FS HFFGD FG FM FGGTPRQQDRNT PR 
GSDI I VDLEVTLEEVYAGNFVEWRNKPVARQAPGKRKCNCRQE 
MRTTQLGPGRFQMTQEVGCDECPNVKLVNEERTLEVEIEPGVRD 
GME Y P F I GEGEPHVDGE PGDLR FR I KWKHP I FE RRGDDL YTNV 
TISLVES LVGFEMDITHLDGKKVHISRDKI TRPGAKLWKKGEGL 
PNFDNNNI KGSLI I TFDVDFP KEQLTEEAREGI KQLLKQG$VQK 
VYNGLQGY 


5556 


5835 


3346 


RTRGMSKNCVPMEFEEYLLRMFQGTFYLLQKITKDNNAHTVKSR 
LEELDESYIEKFTDFLRLFVSVHLRRIESYSQFPWEFLTLLFK
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE
DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR
QSLEWAKVMELLPTHAFSTLFPVLQDNLEVYLGLQQFIVTSGS
GHRLNITAENDCRRLHCSLRDLSSLLQAVGRLAEYFIGDVFAAR
FNDALT\TVERLVKVTLYGSQIKLYNIETAVPSVLKPDLIDVHAQ
SLAALQAYSHWLAQYCSEVHRQNTQQFVTLISTTMDAITPLIST
KVQDKLLLSACHLLVSLATTVRPVFLISIPAVQKVFNRITDASA
LRLVDKAQVLVCRALSNILLLPWPNLPENEQQWPVRSINHASLI
SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVLEDIVENI
SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMLSFFL
TLFRGLRVQMGVPFTEQIIQTFLNMFTREQLAESILHEGSTGCR
WEKFLKILQVWQEPGQVFKPFLPSIIALCMEQVYPIIAERPS
PDVKAELFELLFT^TLHHNWRYFFKSTVLASVQRGIAEEQMENEP
QFSAIMQAFGQSFLQPDIHLFKQNLFYLETLNTKQKLYHKKIFR
TAMLFQFVNVLLQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA
AFLPEFLTSCDGVDANQKSVLGRNFKMDRVRRERGRAKRRAEWA
RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL


5557 


1712 


491 


VILGAGLRDKDMWIPVVGLPRRLRLSALAGAGRFCILGSEAATR
KHLPARNHCGLSDSSPQLWPEPDFRNPPRKASKASLDFKRYVTD
RRLAETLAQIYLGKPSRPPHLLLECNPGPGILTQALLEAGAKW
ALESDKTFIPHLESLGKNLDGKLRVIHCDFFKLDPRSGGVIKPP
AMSSRGLFKNLGIEAVPWTADIPLKWGMFPSRGEKRALWKLAY
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKLY
LIQMIPRQNLFTKNLTPMNYNIFFHLLKHCFGRRSATVIDHLRS
LTPLDARDILMQIGKQEDEKWNMHPQDFKTLFETIERSKDCAY
KWLYDETLEDR


5558 


1509 


96 


RAGCTHPQVPADLGAPAEPRRPQKTCVCLLQPQPGGQRGPTTMI
TGVFSMRLWTPVGVLTSLAYCLHQRRVALAELQEADGQCPVDRS
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVEDIPFLSPTFNPQEVFIRSTNIFRNLESTRCLLA 
GLFQCQKEGPIIIHTDEADSEVLYPNYQSCWSLRQRTRGRRQTA
SLQPGISEDLKKVKDRMGIDSSDKVDFFILLDNVAAEQAHNLPS 
CPMLKRFARMIEQRAVDTSLYILPKEDRESLQMAVGPFLHILES 
NLLKAMDSATAPDKIRKLYLYAAHDVTFIPLLMTLGIFDHKWPP 
FAVDLTMELYQHLESKEWFVQLYYHGKEQVPRGCPDGLCPLDMF 
LNAMSVYTLSPEKYHALCSQTQVMEVGNEE


5559 


150


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 
ELDWDPDGSVPVGLRQRNQTEKQSTGVYIIREAMIjNFCEKETKK 
LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEEKIIRGIDKGRVRAA 
VDKKEAGKDGRGEERAVATKKEEEKKGSDRNTGLSRDKDKKREE 
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPS I FDEPLERVKNNDPEMTE VNVNNSDC 
ITNEILVRFTEALEFNTVVKIiFALANTRADDHVAFAIAIMLKAN 
KT ITSLNLDSNH ITGKG I LAI FRALLQNNTLTELRFHNQRHI CG 
GKTEME I AiaLKENTTLLKI/SYHFELAGPRMTVTWLLSRNMDKQ 
RQKRLQEQRQAQ EAKGE KKDLLE VP KAGAVAKGS PKPSPQPSPK 
PS PKNS P KKGGAPAAP P P P P P PLAP PL I MENLKNS LS PATQRKM 
GDKVLPAQEKNSRDQLLAAIRSSNLKQLKKVEVPKLLQ 


5560 


9 


921 


SSWEFSALSVSMACLSPSQLQKFQQDGFLVLEGFLSAEECVAM 
QQRIGE I VAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFLS SGDK 
IRFFFEKGVFDEKGNFLVPPEKSINKIGHALHAHDPVFKS ITHS 
FKVQTLARSLGLQMPVWQSM YI FKQPH FGGEVS PHQDAS FLYT 
EPLGRVLGVW IAVEDATLENGCLW F I PGSHTSGVSRRMVRAPVG 
SAPGTSFLGSBPARDNSLFVPTPVQRGALVLIHGEWHKSKQNL 
SDRSRQAYTPHLMEASGTTWS PENWLQPTAELP FPQLYT 


5561 


2175 


1775


1 CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ
QLLAPTYFSAPGVMNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYGGVTYYNPAQQQVQPKPSPPRRTPQPVTIKPPPPEWSRGS
S 


5562 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCETLEDLKLH
LQSTDYGNFLANEASPLTVSVIDDRLKEKMVVEFRHMRNHAYEP
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI
IRNTLYKAYLESFYKFCTLLGGTTADAMCPILEFSADRRAFIIT
INSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF


5563 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCSTLEDLKLH
LQSTDYGNFLANEASPLTVSVIDDRLKEKMVVEFRHMRNHAYEP
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI
IRNTLYKAYLESFYKFCTLLGGTTADAMCPILEFEADRRAFIIT
INSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGLALLL
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH
PDCPDSSDELGCGTNEILPEGDATTMGPPVTLESVTSLRNATTM
GPPVTLESVPSVGNATSSSAGDQSGSPTAYGVIAAAAVLSASLV
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP


5565 


993 


138 


RWNS PNPARAGS IS RPQRAPGSVSAVAMTAAVFFGCAFIAFGPA 
LALYVFTIATEPLRIIFLIAGAFFWLVSLLISSLVWFMARVI1D 
NKDG PTQKYLLI FGAFVS V Y I Q EM FR FAY YKLLKKAS EG LKS I N 
PGETAPSMRLLAYVSGLGFGIMSGVFSFVNTLSDSLGPGTVGIH 
GDS PQ F FL YS AFMTLV 1 1 LLH VFWG I VFFDGCE KKKWG I LL I VL 
LTHLL VS AQT F I S S Y YGI NLAS AFI I L VLMGTWAFLAAGGS C RS 
LKX.CLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT
KNVKEYVRWMMYWIVFALYTVIETVADQTVAWFPLYYELKIAFV
IWLLSPYTKGASLIYRKFLHPLLSSKEREIDDYIVQAKERGYET
MVKFGRQGLNLAATAAVTAAVKSQGAITERLRSFSMHDLTTIQG
DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA
EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK
RPQVYF


5567 


1554 


233 


EFLGSGVSPDLANEDGLTALHQCCIDDFREMVQQLLEAGANINA 
CDSECWTPLHAAATCGHLHLVELLIASGANLLAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITQDSIEAARAVPELRMLDDIRSRLQ 
AGADLHAP LDHGATliLHVAAANGF SE AAALLLEHRAS LS AKDQD 
GWEPLHAAAYWGQVPLVELLVAHGADLNAKSLMDETPLDVCGDE 
EVRAKLLELKHKHDALLRAQSRQRSLLRRRTSSAGSRGKVVRRV 
S LTQRTDLYRKQHAQE AI VWQQP P PTS P EP PEDNDDRQTGAELR 
P P PPEEDNPEWRPHNGRVGGS PVRHLYSKRLDRS VS YQLSPLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETAEP 
GL PGDTVT P Q PDCG FRAGGDP PLLKLTAPAVE AP VERRP CCLLM 


5568 


1731 


COT 

oo / 


AEDRQPAS RRGAGTTAAMAAS GPG CRS WCL CPE VPS ATFFTALL 
S LLVSGPRLFLLQQPLAPSGLTLKS EALRNWQVYRLVT Y I FVYE 
NP ISLLCGAI 1 1 WRFAGNFERTVGTVRHCF FTVI FAI FSAI I FL 
S FE AVS S LS KLGE VEDARGFTP VAFAMLGVTT VRSRMRRALVFG 
MWPSVLVPWLLLGASWLIPQTSFLSNVCGLSIGLAYGLTYCYS 
I DLS ER VAL KLDQTFP FS LMRR I S VFKYVSGS S AERRAAQ SRKL 
NP VPGS YPTQSCHPHLS P SHPVSQTQHASGQKLASWPS CTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNSPGTVYSGALGTPGAAGSKESSRVPMP 


5569 


2 


835 


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG 
LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
KGEPGIPAIPGIRGPKGQKGEPGLPGHPGKNGPMGPPGMPGVPG
PMGIPGEPGEEGRYKQKFQSVFTVTRQTHQPPAPNSLIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTSKTNQVNSGGVLLRLQVGEEVWLAVNDYYDMVGIQG
SDSVFSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTVCL 
DVINQTWTALYDLTNIFESFDPQLLAYPNPIDPLNGDAAAMYLH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTVCL
DVINQTWTALYDLTNIFESFLPQLLAYPNPIDPLNGDAAAMYLH
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTD YRTG I PGRRFRVMAAG DGDVKLGTLGS G S E S SNDGGS E S PG 
DAGAAAEGGGWAAAALALLTGGGEMLLNVALVALVLU3AYRLWV 
R WGKRGLGAGAGAGEES PATS LPRMKKRDFSLEQLRQ YDGSRNP 
R I LLAVNGKV FDVT KGS KF YG PAG P YG I FAGRDAS RGLAT F CLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKQD 


5573 


2562 


219 


VPARTPNAEDQGPEARAATAT PCQSGGRERAGEAAEDGVKMAAF 
SEMGVMPE I AQAVEEMDWLLPTDIQAES I PL I LGGGDVLMAAET 
GSGKTGAFS I PVIQ I VYETLKDQQEGKKGKTT I KTGASVLN KWQ 
MNP YD RGS AFAI GS DGLCCQ S REVKE WHGCRAT KG LMKG KH Y Y E 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNKQFD 
NYGEEFTMHDTIGCYLDIDKGHVKFSKNGKDLGLAFEIPPHMKN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALSKAPDGYIVK 
SQHSGNAQVTQTKFLPNAPKALIVEPSRELAEQTLNNIKQFKKY 
I DNPKLRELLI IGGVAARDQLSVLENGVDI WGTPGRLDDLVST 
GKLNLSQVRFLVLDEADGLLSQGYSDFINRMHNQIPQVTSDGKR 
LQV IVCS ATLHS FDVKKLS EKIMH FPTWVDLKGEDSVPDTVHHV 
WPVN PKTDRLWERLGKS H IRTDD VHAKDNTRPGANS PEMWS EA 
IKILKGEYAVRAIKEHKMDQAIIFCRTKIDCDNLEQYFI QQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 
RGIDI HGVPYVINVTLPDEKQNYVHR IGRVGRAERMGLAI SLVA 
TEKE KVW YHVCS SRGKGC YNTRUCEDGGCTI W YNEMQLLS E I EE 
HLNCTISQVEPDIKVPVDEFDGKVTYGQXRAAGGGSYKGHVDIL 
APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQPEDKGAVPEDASTERSAMASLGLQLVGYILG 
LLGLLGTLVAMLLPSWKTS S YVGAS I VTAVGFSKGLWMECATHS 
TGITQCDIYSTLLGLPADIQAAQAMMVTSSAISSLACIISWGM 
RCTVFCQESRAKDRVAVAGGVFFILGGLLGFI PVAWNLHG I LRD 
FYSPLVPDSMKFEIGEALYLGI ISSLFSLIAGI ILCFSCSCQRN 
RSNYYDAYQAQPLATRSSPRPGQPPKVKSEFNSYSLTGYV 


5575 


456 


766 


LLWALP C P P PTAAAVLLS S TGLMELLE KMLALTLAKADS PRTAL 
LCSAWLLTAS FSAQQHKGS LQKDPLLSQACVGCLEALLDYLDAR 
SPDIGRNSPHYLMFP 


5576


249 


2146 


RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFVLFLFLLHRDVSSR 
EEATEKPWLKSLVSRKDHVLDLMLEAMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DQKFRRCPPLATTS VI IVFHNEAWSTLLRT VYS VLHTTPAI LLK 
E 1 1 LVDDAS TE EHLKE KLEQ YVKQLQ WRWRQ EERKGL ITARL 
LGASVAQAEVLT FLDAHCECFHGWLEPLLARI AEDKTVWS PD I 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWETLPPHEKQRR 
KDETYP I KS PTFAGGLFS I SKS YFEH IGT YDNQME I WGGENVEM 
S FRVWQ CGGQLE 1 1 PCS WGHVFRTKS PHTFPKGTS VI ARNQVR 
LAEVWMDSYKKIPYRRNLQAAKMAQEKSPGDISERLQLREQLHC 
HNFSWYLHNVYPEMFVPDLTPTF YGAI KNLGTNQCLDVGENMRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQIjCIiHVSKGAIiG 
LGSCHFTG KNSQ VPKDE EWEIxAQDQL I RNSGSGTCLTS QDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCSCGE I SVHCLPWVLF I LDLKVBS S MFCPLKL ILLPVLLD 
YSLGI^DLWSPPELTVHVGDSALMGCVFQSTEDKCIFKIDWTL 
S PGEHAKDE YVL YYYSNLS VP I GRFQNR VHLMGD I L CNDG S LLL 
QD VQEADQGT YI CB I RLKGES Q VFKKA VVLHVL PEEPKELMVHV 
GGL IQMG CVFQSTE VKHVTKVE W I FSGRRAKEE I VFRY YH KLRM 
S VEYSQSWGHFQNRVNLVGDI FRNDGS IMLQGVRESDGGNYTCS 
I HLGNLV F KKT IVLHVS P EEPRTL VTPAALRP LVLGGNQL VI IV 
G I VCAT I LLLPVL I L I VKKTCGNKSS VNS TVLVKNTKKTNPE I K 
EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMHPV 
WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 
WFGDFSSFRAIiLEPELRPEDRILVLGCGNSALSYELFLGGFPNV 
TSVDYSSVWAAMQARYAHVPQLRWETMDVRKLDFPSASFDWL 
E KGTLDALLAGERD P WTVSSEGVHTVDQVLSE VSR VLVPGGRF I 
SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LS VAQLALGAQ ILSP PRPPTS PCFLQDS DHEDFLSAI QL 


5579 


3 


1540 


PJJSGLARGASALARHGGGLAGGVGWDCGACASRCQGVMEGLLTR
CRALP ALATCSRQL S G YVP CR FHHCAPRRGRRLbLS RVFQPQNL 
REDRVLS LQD KS DDLTCKS QRLMLQVGL I YPAS PGCYHLLPYTV 
RAMEKLVRVIDQEMQAIGGQKVNMPSLSPAELWQATNRWDLMGK 
ELLRIiRDRHGKEYCLGPTHEEAITALIASQKKLSYKQLPFLLYQ 
VTRKFRDEPRPRFGLLRGREFYMKDMYTFDSSPEAAQQTYSLVC 
DAYCSLFNKLGL PFVKVQADVGT IGGTVSHE FQLPVDIGEDRLA 
I CPRCSFSANMETLDLSQMNCPACQGPLTKTKGI EVGHTFYLGT 
KYS S I FNAQFTNVCGKPTliAEMGC YGLGVTRI LAAAI EVLSTED 
CVRWPSLLAPYQACLI PPKKGSKEQAASELIGQLYDHI TEAVPQ 
LHGEVLLDDRTHIjT IGNRLKDANKFG YP FV 1 1 AGKRALED PAH F 
EVWCQNTGEVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRCIPGFWPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGEIjGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELFRGHSKTREFLAHSAKVHSVAWSCDGRRIiASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKTIR I WDVRTTKC IATVNTKGENINICWS PDGQT IAVGNK 
DDWTFIDAKTHRSKAEEQFKFEVNEISWNNDNNMFFLTNGNGC 
INILSYPELKPVQSINAHPSNCICIKFDPKGKYFATGSADALVS 
LWDVDELVCVRCFSRLDW P VR TLS FSHDG KMLASASEDH F IDIA 
EVETGDKLWEVQCESPTFTVAWHPKRPLLAFACDDKDGKYDSSR 
EAGTVKLFGL PNDS 


5581


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSLYPTNSPSYAPEFQFLHSAYATLLMKQAWPQNSSSCGTEG 
TFHLPVDTGTENRTYQAS SAAFRYTAGTP YKVP PTQSNTAPP P Y 
SPSPNP YQTAM YP I RS A YPCX2NL YAQGA Y YTQP VYAAQPHVIHH 
TTWQ PNS I PS AI Y PAP VAAPRTNGVAMGMVAGTTMAMS AGTLL 

TTPOWT JV TCI AMP VQMPT YP AOfiTPfc V QWPWW 


5582 


5775


2739 


I ITNNNNVI I PLVI AYHLSGSAQARGERS PAERLMERQKRKADI 
EKGLQF I QSTLPLKQEEYEAFLLKLVQNLFAEGNDLFRE KDYKQ 
ALVQYMEGLNVADYAASDQVALPRELLCKLHVNRAACYFTMGLY 
EKALEDSEKALGLDSESIPJ^FRKAPJUjNELGRHKEAYECSSRC 
SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGS IDDI ETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSIj 
VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL 
DSFGSTRGSLDKPDSFMEETNSQDHRPPSGAQKPAPSPEPCMPN
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDYTYREGLEHKCK
RDILLGRLRSSEDQTWKRIRPRPTKTSFVGSYYLCKDMINKQDC
KYGDNCTFAYHQEEIDVWTEERKGTLNRDLLFDPLGGVKRGSLT
IAKLLKEHQGIFTFLCEICFDSKPRIISKGTKDSPSVCSNLAAK
HSFYNNKCLVHIVRSTSLKYSKIRQFQEHFQFDVCRHEVRYGCL
REDSCHFAHSFIELKVWLbQQYSGMTHEDIVQESKKYWQQMEAH
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGQVVEPDKDLK
YCSAKARHCWTKERRVLLVMSKAKRKWVSVRPLPSIRNFPQQYD
LCTHAQNGRKCQYVGNCSFAHSPEERDMWTFMXENKILDMQQTY
DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL
CGKNSNSKKQWQQHIQSEKHKEKVFTSDSDASGWAFRFPMGEFR
LCDRbQKGKACPDGDKCRCAHGQEELNEWLDRREVLKQKliAKAR
KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE


5583


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW
HQLSVTLEDLYNGVTKKLALQKNVICEKCEGVGGKKGSVEKCPL
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERXNPKDRCESC
SGAKVIREKKIIEVHVEKGMKDGQKILFHGEGDQEPELEPGDVI
IVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI
LVITSKAGEVIKHGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN
WRQHREAYEEDEDGPQAGVQCQTA






1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW
HQLSVTLEDLYNGVTKKLALQKNVICEKCEGVGGKKGSVEKCPL
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC
SGAKVIREKKIIEVHVEKGMKDGQKILFHGEGDQEPELEPGDVI
IVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI
LVITSKAGEVIKHGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN
WRQHREAYEEDEDGPQAGVQCQTA


5585 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS
GHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR
YPKGAIFLFLAGRIEVTKGNIDAAIRRFEECCEAQQHWKQFHHM
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLESAKQ
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL


5586 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG

GHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLESAKQ
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL


5587 


1768 


148 


SS AVPDGAVGR P VAVAVGGPPHS CRCR P CCLMAA I G VHLGCTS A 
CVAVYKDGRAGWANDAGDRVTPAVVAYSENEEIVGLAAKQSRI 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCLVIEKNGKLRYE 
I DTGE E TKFVNP ED VARL I FS KMKETAHS VLGS DAND W I TVPF 
DPGEKQKNALGEAARAAGFNVLRLIHEPSAALLAYGIGQDS PTO 
KSNI LVFKI/3GTSLSLS VMEVNSG I YRVLSTNTDDNIGGAHFTE 
TLAQYLASEFQRSFKHDVRGNARAMMKLTNSAEVAKHSLSTLGS 
ANCFLDSLYEGQDFDCNVSRARFELLCS PLFNKCI EAIRGLLDQ 
NGFTADD I NKWL CGGSS R I P KLQQLI KDLF PAVELLNS I P PDE 
VI P I GAAI EAG I L IGKENLL VEDSLMI ECS ARD I L VKG VDE SGA 
S RFT VL F PSGTP L P ARRQHTLQAPG S I S S VCLEL YES DG KNS AK 
EETKFAQWLQDLDKKENGLRDILAVLTMKRDGSLHVTCTDQET 
GKCEAISIEIAS 


5588 


3 


589 


T P P P P EQAMVAATVAAAWLLL W AAACAQQ EQD F Y D F KAVN I RGK 
L VSLEKYRGS VSL WNVAS ECG FTDQH YRALQQLQRDLGPHHFN 
VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQ I TALVRKLI LLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRSPFSSFLPRPICLSLEARPCSIEDRRNWSLIGRPGAPAS 
GLNRSSGLWIiGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 
ATLGAAGQ P LGG ESI CS ARAPAK YS I T FTGKW S QTAFP KQ YP L F 
R PPAQ WSS LLGAAHSS D YSM WRJCNQ Y VSNGLRD FAERGEA WALM 
KEIEAAGEALQSVHAVFSAPAVPSGTGQTSAELEVQRRHSLVSF 
WRIVPSPDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSGF 
TFSSPNFATI PQDTVTE ITS S SPSHPANS FYYPRLKALPP IARV 
T LLRLRQS PRAF I P PAP VL PS RDNE I VDS AS VP ET PLDCE VS LW 
S S WGL CGGHCGRLG TKSRTR YVR VQPANNGS PCPELEEEAE CVP 
DNCV 


5590 


72 


896 


LCSSGALRLLPAM VAWRSAFL VC LAFS LATLVQRGSGDFDD FNL 
EDAVKETS S VKQP WDHTTTTTTNRPGTTRAPAKPPGS GLDLADA 
LDDQDDGRRKPGIGGRERWNHVTTTTKRPVTTRAPANTLGNDFD 
LADALDDRNDRDDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYG SNDDPG SGMVAEPGT IAGVAS ALAMAL I GAVS S Y I S Y 
QQKKFCFS IQQGLNADYVKGENLEAWCEEPQVKYSTLHTQSAE 
PPPPPEPARI 


5591 


68 


1494 


AGSSRRAAAERLLVSAGCRSLAGRASGVLLLPAELLPGEEEAMA 
LRVTRNS K INAENKAK INMAG AKRVPTAP AATS KPGLR PRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPILVDTASPSPMETSG 
CAPAEEDL CQ AFS D VI LA VND VDAEDGAD PNL CS E YVKD I YA Yh 
RQLEEEQAVRPKYLLGREVTGNMRAI LI DWLVQVQMKFRLLQET 
MYMTVSI IDRFMQNNCVPKKMLQLVGVTAMFIASKYEEMYPPE I 
GDFAFVTDNTYTKHQI RQMEMKI LRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
I LDNGE WTP TLQHYLS YTEESLL P VMQHLAKNAAMVNQGLTKHM 
TVKNKYATS KHAKI STLPQLNSALVQDLAKAVAKV 


5592 


242 


924 


YGE S KDWNQ KD LLS AL VLTTVNCLPTP IMAKSAEVKLAI FGRAG 
VGKSALWRFLrKRFIWEYDPTLESTYRHQATIDDEWSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
I KKPKNVTL I LVGNKADLDHS RQVSTEEGEKIATELACAFYECS 

ACTG FGN I TE I FYF T .PP FVB P P P M\TCl C1Y TP P P Q QTTUVKTHi TW V 

MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR
ELDLYDNQIKKIENLEALTELEILDISFNLLRNIEGVDKLTRLK
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV
MLALPSVRQIDATFVRF
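
The notation defined in the legend above can be tallied mechanically. The following is a minimal sketch and is not part of the original listing: the function name `summarize_segment` and the returned field names are hypothetical, and the sketch assumes that whitespace inside a segment is a layout artifact rather than data.

```python
# Characters the legend defines as residues (X = unknown residue).
LEGEND_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")
# Marks the legend defines for stop codons and possible frame shifts.
LEGEND_MARKS = set("*/\\")


def summarize_segment(raw: str) -> dict:
    """Tally one amino acid segment written in the table's notation."""
    # Assumption: spaces and line breaks carry no meaning and are dropped.
    seq = "".join(raw.split())
    return {
        "residues": sum(ch in LEGEND_RESIDUES for ch in seq),
        "unknown": seq.count("X"),
        "stops": seq.count("*"),
        "deletions": seq.count("/"),
        "insertions": seq.count("\\"),
        # Anything outside the legend is reported rather than guessed at.
        "unrecognized": sorted(
            {ch for ch in seq if ch not in LEGEND_RESIDUES | LEGEND_MARKS}
        ),
    }


if __name__ == "__main__":
    print(summarize_segment("MLALPSVRQ I DATFVRF"))
```

Characters that the legend does not define are listed under `unrecognized` instead of being corrected, so damaged segments can be flagged for manual review.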


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR
ELDLYDNQIKKIENLEALTELEILDISFNLLRNIEGVDKLTRLK
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV
MLALPSVRQIDATFVRF


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGER.QRQLPRAWRPVGRTLGSE 
PIAIAWSPPLYLFPIPLPSWAVSQPTPTIX3TMFADLDYDIEEDK 
LG I PTVPGKVTLQ KDAQNL IG I S I GGGAQ YCP CL Y I VQ VFDNTP 
AALDGTVAAGD E I TGVNGRS I KGKTKVE VAKM I QEVKG E VT I HY 
NKLQADP KQGM S LDI VLKKVKHRLVENMS S GTADALGLSRAI LC 
NDGLVKRLEELERTAELYKGMTEHTKNLLRAFYELSQTHRAFGD 
VFSVIGVREPQPAASEAFVKFADAHRSIEKFGIRLLKTIKPMLT 
DLNTYLNKAI PDTRLTI KKYLDVKFEYIjS YCLKVKEMDDEEYSC 
IALGEPLYRVSTGNYEYRL I LRCRQEARARFS QMRKDVLEKMEL 
LDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLRDADVFPIEVDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596 


698 


219 


GAVLAPSSLPAAELAAQGESQSLEDLSNTSRPTSEVYKISFIFP 
NGDKYDGDCTR'TSSGIYERNGIGIHTTPNGIVYTGSWKDDKMNG 
FGRLEHF SGAVYEGQ FKDNM FHG LGT YTFPNG AKYTGN FNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3 


731 


ISCKMAADGQSSLPASWRSVTLTHVEYPAGDLSGHLLAYLSLSP 
VFV I VGFVTLI I FKRELHTI S FLGGLALMEG VNVI L I KNVIQEPR 
PCGGPHTAVGTKYGMPSSHSQFMWFFSVYSFLFLYLRMHQTNNA 
RFLDLLWRHVLSLGLLAVAFLVSYSRVYLLYHTWSQVLYGGIAG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKLQ 


5598 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPIiLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLIiAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTIWTECGKLLEE 
I KCALCS PHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GH I PG FLQTTADE F C FY YAR KDGGLC F P D FPRKQVRG P ASNYLD 
QMEEYDKVEE1SRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRWEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLY I ILGDGM 
I TLDDME EMDGLSDFTGSVLRLDVDTDM CNVP YS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGS YVFGDRNGNFLTLQQS PVTKQWQEKPLCLGTSGS CRG YFSG 
HIIiGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRKGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKIjLLIAVAIiGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTNNTECGKLLEE 
IKCALCSPHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRWEYTVSRK 
NP HQVDLRTAR VFLEVAELHRKHLGGQLL FGPD GFLY 1 1 LGDGM 
I TLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYSI PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDBLGEVYILSSSKSMTQTHNGKLYKIVDPJCRPLMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC
TFYHPTINVPPRHALKWIRPQTSE


5601


1977


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE


5602 


246 


766 


YHTS CTVWRTAKEALENTEVP VGCLM VYNNEWGKGRKEVNQTK 
NATRHAEMVA I DQVLDWCRQS GKS P S E VFEHTVLYVTVE PCI MC 
AAALRLMKI PL WYGCQNE R FGG CG S VLNIAS ADL PNTGRP FQ C 
I PG YRAEEAVEML KTFYKQENPNAPKS KVRKKECQQ I LNMF 


5603 


1 


565 


FRGRTPI SGGERGCAQYPI PATPARSGENRTMPGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
C FGFEDLH FR WT YNS SDAFK I L I EGTVKNEKSD P KVTLKDDDR I 

TLVGSTKEKRNNISIVLRDLEFSDTGKYTCHVKWPKENNLQHHA 
T I FLQ WDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT 
DQGQVALGGHYMAEGEGYFAMSEDELACSPYIPLGGDFGGGDFG 
GGDFGGGDFGGGD FGGGGS FGGHCLD Y CES PTAHCNVLNWE QVQ 
RLDGILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKRIGVR 
D VRLNGS AASHVLHQDSGLGYKDLDL I FCADLRGEGE FQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKWELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 
NPMTETFHPTIIGESVYGDFQEAFDHLCNKIIATRNPEEIRGGG 
LLKYCNLLVRGFRPASDEIKTLQRYMCSRFFIDFSDIGEQQRKL 
E S YLQNHFVGLEDR K YE YLMT LHG WNE S T VCLMGHERRQTLNL 
I TMLA I RVLADQNV I PNVANVTC Y YQ PAP YVADANFSNY Y I AQ V 
Q PVFTCQQQT YSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRRYP LPLRSGKEAKILQHFGDGLCRMLDERLQRHRTS G 
GDHAPDS PSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGS YWP 
ARHSGARVI LLVLYREHLNPNGHHFLTKE ELLQRCAQKS PRVAP 
GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGLELAQKLAESE 
GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLELRP 
GEYRVLLCVDIGETRGGGHRPELLRELQRLHVTHTVRKLHVGDF 
VWVAQETNPRDPANPGELVLDHI VERKRLDDLCSS 1 IDGRFREQ 
KFRLKRCGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 
FFVKRTADIKESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
SGAMTSPNPLCSLLTFSDFNAGAIKNKAQSVREVFARQLMQVRG 
VS G E KAAAL VDR YS TPAS LLAAYDACATPKEQETLLST I KCGRL 

ORNLG PAT . R TT ,Q ClT .VP Q Vf3 PT .T 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSSVGSISEEETCEKLKGLIQRQVQMCKRKLEVMDSVRRGAQLA 
I EE CQ YQFRNRR WNCS TIjDS LP VFGKWTQGTREAAFVYAI S SA 
GVAFAVTRACS S G ELE KCGCDRTVHG VS PQGFQWSGCS DN I AYG 
VAFSQSFVDVRERSKGASSSRALMNLHNNEAGRKAILTHMRVEC 
KCHGVSGS CEVKTCWRAVPPFRQVGHALKEKFDGATEVE PRRVG 
S S RALVPRNAQFKPHTDEDLVYLE PS PDFCEQDMRS GVLGTRGR 
TCNKTSKAIDGCELLCCGRGFHTAQVELAERCSCKFHWCCFVKC 
RQCQRLVELHTCR \ 


5607 


521 


141 


PP VCNPAEAMPS PGTVCS LLLLGMLWLDLAMAGSSFLS PEHQRV 
QQRKES KKP P AKLQ PRALAG WLR P EDGG QAEGAEDELE VRFNAP 
FDVGIKLSGVQYQQHSQALGKFLQDILWEEAKEAPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
R I QTEPKYTG I WHCVRDT YHRERVWGFYRGLLL P VCTVS LVS S E 
VFGTYRHCLAHICRLRFGNPDAKPTKADITLSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPLAVPPMCPVPPACPEP 
KYRGPLHCLATVAREEGL03LYKGSSALVLRDGHSFATYFLSYA 
VLCEWLSPAGHSRPDVPGVLVAGGCAGVLAWAVATP^©VIKSRL 
Q ADGQGQRR YRG LLHCM VT I VREEGP RVL FKGLVLNCCRAFPVN 
MWFVAYEAVLRLARGLLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KR I REAKRSARPELKDS LDWTRHNYYES FSLSPAAVADNVERAD 
ALQLSVEEFVERYERPYKPVVLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCGEDNDGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 
EH P KRRKLLED Y KVP KF FTDDLFQ YAGE KRR PP YRW F VMG P PRS 
GTG I H I DPLGTS AWNALVQGHKRWCLFPTSTPRELI KVTRDEGG 
NQQDEAITWFNVIYPRTQLPTWPPEFKPLEILQKPGETVFVPGG 
WWHWLNLDTT I AI TQNFAS STNF P WWHKT VRGRPKLSR KW YR 
ILKQEHPELAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 
R 


5610 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLSAKWADNFMAEG 
CX3GSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQPFNPLLFLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 
VI I FTGLFS VAFLGRRLVLSQWLGI LATIAGLVWGLADLLSKH 
DSQHKLSEVI TGDLL I I MAQI I VAIQMVLEEKFVYKHNVH PLRA 
VGTEGLFGFVILSLLLVPMYYIPAGSFSGNPRGTLEDALDAFCQ 
VGQQPL I AVALLGNISS I AFFNFAG I S VTKELSATTRMVLDS LR 
T W I WALS LALG WE AFHALQ I LG FL I LL IGTAL YNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 


577 


FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPALAMSG
ELSNRFQGGKAFGLLKARQERRIiAE INREFLCDQKYSDEENL PE 
KLTAFKE KYME FDLNNEGE I D LMS L KRMME KLG VP KTHLE M K KM 
I SE VTGGVSDT I S YRDFVNMMLGKRSAVLKLVMMFEGKANES S P 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDG YMDAT I APHR I P P EM PQ YGE ENH I FELMQAMWLCKHLNS 
SLLTLENLILNEFSYTATEARRLYLQRKTVPSALLVQLIQERLA 
E EDC I KQG WI LDG I P ETREQALR IQTLG I TPRHV I VLS APDTVL 
IERNLGKRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISELET 
AQKLLE YHRNI VRVI PS YPKI LKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVLLLGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRNLFFFLCLNLSFAFVELLYGIWSNCLGLISDSFHMFFDST 
AI LAGLAASVI S KWRDNDAFS YG YVPJVEVLAGFVNGLFLIFTAF 
F I FSEG VERALAP PD VHHERLLLVS I LGFWNL I G I FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGH FHSHDGPS L KE TTG PSRQ I LQG VFLHI LADTLGS IG V I 
AS AIMMQNFGLM IAD P I CS I L I AI L I WS VI P LLRES VG I LMQR 
TPPLLENSLPQCYQRVQQLQGVYSLQEQHFWTLCSDVYVGTLKL 


5614 


3 


12*8 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
AP WGARQRLG VMAELQQLQE FE I PTGRE ALRGNHSALLRVAD YC 
EDNYVQATDKRKALEETMAFTTQALASVAYQVGNLAGHTLRMLD 
LQGAALRQVEARVSTLGQMVNMHMEKVARREIGTLATVQRLPPG 
QKVIAPENLPPLTPYCRRPLNFGCLDDIGHGIKDLSTQLSRTGT 
LSRKS I KAPATPAS ATLGRP PR I PE PVHLP WPDGRLSAASS AS 
S LAS AG SAEGVGGAPTP KGQAAP PAPPL PS SLOP P P P PAAVE VF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWVPASYLBKWTLYPYTSQKDNELSFSEGTVICVTRRY 
S DG WCEG VS S EGTG F F PGN YVE P S C 


5615 


9 


1558 


ALGRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGISFVQTLMHLLKGNIGTG
LLGLPLAIKNAGIVLGPISLVFIGIISVHCMHILVRCSHFLCLR 
FKKSTLGYSDTVSFAMEVSPWSCLQKQAAWGRSWDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LRI YMLCFLP FI I LLVFIRELKNLFVLS FLANVSMAVS LVT I YQ 
YWRNMPDPHNLP I VAGWKKY P LF FGTAV FAFEG IG WLPLENQ 
M KE S KRF PQALNIGMG I VTTL YVTLATLG YM CFHDE I KGS I TLN 
LPQDVWLYQSVKILYSFGIFVTYSIQFYVPAEIIIPGITSKFHT 
KWKQICEFGIRSFLVSITCAGAILIPRLDIVISFVGAVSSSTIiA 
L I L P PLVE I LTFS KEH YNI WMVLKN I S I AFTG WGFLLGTYI TV 
EEI IYPTPKWAGTPQS PFLNLNSTCLTSGLK 


5616 


1 


719 


DDFVRCX3PQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLSSGDLLRDNMLRGTEIGVLAKAFIDQGKLIPDDVMTRLAL 
HELKNLTQYSWLLDGFPRTLPQAEALDRAYQIDWINLNVPFEV 
IKQRLTARWIHPASGRVYNIEFNPPKTVGIDDLTGEPLIQREDD 
KPETVIKRLKAYEDQTKPVLEYYQKKGVLETFSGTETNKIWPYV
YAFLQTKVPQRSQKASVTP


5617 


176 


765 


P WRGRGS R PRGAGAMAEE Q VNRSAGIoAPDCEASATAE TTVS SVG 
TCE AAGKS PEP KD YDSTCVFCR I AGRQDPGTELLHCENEDli I C F 
KD I KPAATHHYLWPKKH IGNCRTLRKDQVELVENMVTVGKTIL 
ERNNFTDFTNVRMGFHM PPFCS I SHLHLHVLAPVDQLGFLS KLV 
YRVNSYWFITADHLIEKLRT 


5618 


3 


1692 


YLNYINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKSIRLLSEIEKLVGTSVPGLLEIILSSSILEIYN 
HILQTWPDEDVTFRKSCATKRKLSNINQEEASGTSLHQKAIMT 
FTCHNE INAFWLSRGS Q I LSLNS TR FLTKLGHCS SAC P S DS VS 
QTNIQNLKGLNSPVLIGKSKDPSCVAKVSEEGKPAIGTQKMELH 
VRWRS DTGKC VDAS PL W I P TFD KS STT VY IGSHS HRM KAVD FY 
SGKVKWEQI LGDR I ESS AC VS KCGN F I WGC YNGL VYVLKSNSG 
EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 
VWKSKCGGTVFSS PCLNLI PHHLYFATLGGLLLAVNPATGNVI W 
FCHSCGKPLFSSPQCCSQYICIGCVDGNLLCFTHFGEQVWQFSTS 
GP I FSSPCTS PSEQKI FFGSHDCF I YCCNMKGHLQWKFETTS RV 
YATPFAFHNYNGSNEMLLAAASTDGKVW I LESQSGQLQS VYELP 
GE VFS S P WLE S ML I IGCRDN YVY C LDLLGGNQK 


5619 


2160 


1477 


DS PVLPTSGNVI STAQPAQPWSAVEAALRSLGS P PGAGRGCPCP 
AQSLHS HQLAAWD P LKP S LRS Y PPHLLQHPQLRS LTASSGHLGR 
RS CPQ P R PLE ELLRAGS STRPQ PLTS S CCGMSCM YS FLGHC S VL 
LWGTKGRGSGS PSS PGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 
TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


930 


162 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGST
AIGIQTSEGVCLAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG
LIADAKTLIDKARVETQNHWFTYNETMTVESVTQAVSNLALQFG
EEDADPGAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFVQCDAR
AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVMEEKLNA
TNIELATVQPGQNFHMFTKEELEEVIKDI


5621 


3 


819 


VVEFVEYTATDANVKWESLSSVQQI^IKMTVRYGKFLSLLKDGA 
ENDLTWVLKHCERFLKQQQTS I KS SLLCLQGNYAGHDWF VS S LF 
M I MLGDKE KTFQ FLHQ FSRLLTS AFLWL P RLHI S S YLPNDTVE S 
GIHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 
HTQTQDLQVFLKEEALHGFRVSDYFEYME I LEQNYRTVLLRDMR 
NIRLQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRNPMKAMY
PGTFYFQFKNLWEANDRNETWIXFTVEGIKRRSWSWKTGVFRN 
QVDSETHCHAERCFLSWFCDDILSPNTKYQVTWYTSWSPCPDCA 
GEVAEFLARHSNVNLT I FTARL Y YFQYPCYQEGLRSLSQEGVAV 
EIMDYEDFKYCWENFVYNDNEPFKPWKGLKTNFRLLKRRLRESL 
Q 


5623 


3 


954 


FLPFFIRAPKISRNGQWLFTFTTPFPFANKALPGWEGIVPACFW 
RKKI LTPS TGTM E LLQ VT I LFLL PS I CSSNS TGVLEAANNS L W 
TTTKPS ITTPNTE S LQ KNWTPTTGTTPKGT I TNELLKMS LMST 
ATFLTS KDEGLKATTTDVRKNDS 1 1 SNVTVTS VTLPNAVSTLQS 
S KPKTE TQSS I KTTE I PGS VLQPDAS PS KTGTLTS I P VT I PENT 
SQSQVIGTEGGKNASTSATSRSYSS I ILPWIALIVITLSVFVL 
VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624 


159 


898 


PG VAAAAGAL PQ YHG PAP AL VS CRRELSLS AGS LQLER KRRDFT 
SSGSRKLYFDTHALVCLIiEDNGFATQQAEIlVSALVKILEANMD 
IVYKDMVTKMQQEITFQQVMSQIANVKKDMIILEKSEFSALRAE 
NEKI KLELHQLKQQVMDEVI KVRTDTKLDFNLEKSRVKELYSLN 
EKKLLELRTEIVALHAQQDRALTQTDRKIETEVAGLKTMLESHK 
LDN I KYLAGS I FTCLTVALG F YRLW I 


5625 


1 


1180 


TIPSSAAAQRAGPPAGALEALSPGGARAHAERRGEMRATPLAAP 
AGSLSRKKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTEYTCR 
VYPVQEALAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHSLVRSRHR I PEPEAAVLFRQMATALAHCHQHGLVLRDLKL 
CRFVFADRERKKLVLENLEDS CVLTGPDDSLWDKHAC PAYVGPE 
ILS SRAS YSGKAAD VWSLGVALFTMLAGHYPFQDS E PVLLFGKI 
RRGAYALPAGLSAPARCLVRCLLRRE PAERLTATG I LLHPWLRQ 
DPM PLAPTRSHLWEAAQ WP DGLGLDEAREEEGDREWL YG 


5626


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAISI
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE
IFRYSFYMLTCIDMDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF
RHLYKQRRRRYGQKKKKIH


5627


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAISI 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN 
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE
IFRYSFYMLTCIDMDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF
RHLYKQRRRRYGQKKKKIH


5628


75 


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 
SLS PVARS FSACSVGLGRSSYRATSCLPALCLPAGGFATS YSGG 
GGWFGEG I LTGNEKETMQSLNDRLAGYLEKVRQLEQENASLESR 
I RE WCEQQ VP YMC P D YQS YFRT I E ELQKKTLCS KAENARL WE I 
DNAKLAADDFRTKYETEVSLRQLVESDINGLRRILDDLTLCKSD 
LEAQVESLKEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 
DLNRVLEBMRCQYETLVENNRRDAEDWLDTQSEELNQQWSSSE 
QLQSCQAE 1 1 ELRRTVNALE I ELQAQHSMRDALESTLAETEARY 
SSQLAQMQCMITNVEAQLAEIRADLERQNQEYQVLLDVRARLEC
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCS ARP I CVPCPGGRF 


5629 


228*7 


938 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSAASRRSPAARPPV 
PAP P ALPRGRPGTEGS TS LS AP AVL WAVAWWWSAVAWAMA 
NYIHVPPGS PE VPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QE VTLQLFTDG I TNKL I GC Y VGNTMED WLVR I YGNKTE LLVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAIFRLIARQLAKIHAIHAHNGWIPKSNLWLKMGKYFSLIPTGF 

ID NO:
Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence
Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence
Amino acid segment containing signal peptide

ADEDINKRFLSDIPSSQIIiQEEMTWMKEILSNLGSPWLCHNDL 
LCKNI I YNEKQGDVQFI DYSYSGYNYLiAYDIGNHFNEFAGVSDV 

nVCT VnHDDT /~N O /"\TYIT T*5 7V'VT ~1/tj*"T~i i~itxj~i i~i-~imi~ 

V Y b Jj Y PDRELiQSQWLRAYIjS AY KE F KGFGTE VTE KEVE I L FIQV 
NQ FALASH F FWGLWAL I QAKYST I E FDFLG YA I VRFNQ YFKMKP 
EVTALKVPB 


5630 


1194 


278 


GFWAI AQTCAHHLPPGS PWLVPAS PWRLPEMSS FGYRTLTVALF 
TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
NSNVS VYQ P PRQ VI LTLQ PTLVAVG KS FT I ECR VPTVEPLDSLT 
LFLFRGNETLH YE TFG KAAPAPQEATATFNSTADREDGHRNFS C 
LAVLDLMSRGGNI FHKHS APKMLE I YE PVSDSQMVI IVTWSVL 
LSLFVTSVLLCFI FGQHLRQQRMGT YG VRAAWRRL PQAFR P 


5631 


1053 


290 


SRVDDF VRPEPSRAE PS RSGRRR P ARRAATMS VFGKLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 
TKNKRAALQALKRKKRYE KQLAQ I DGTLST I EFQREALENANTN 
TEVLKNMGYAAKAMKAAHDNMDIDKVDELMQDIADQQEIiAEEIS 
TAI S KP VGFGE E FDEDE LMAELEE LEQE ELDKNLLS I SGPETVP 
LPNVPS IALPSKPAKKKEEEDDDMKELENWAGSM 


5632 


3 


952 


WLGWS PPRRLWWGSLGAAQR PAVPVSGLARSLHVETRRPHRRA 
SVRVARGRLGVWAQPQPLLPRPVGSRREMQPPGPPPAYAPTNGD 
FTFVS SADAEDLSGS IAS PDVKLNLGGDFI KESTATTFLRQRGY 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYKIRCVLMPMPSLGF 
NRQWRDNPD FWG P LAWLFFS MIS L YGQ FRWS WIITIWIFGS 
LTIFLLARVLGGEVAYGQVLGVIGYSLLPLIVIAPVLLWGSFE 
WS TL I KLFGVFWAAYSAAS LLVGEE FKTKKPLL I YP I FLIiYIY 
FLSLYTGV 


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQLLSSHLHQVPFFCCFTWCLCN 
CLFENS VS KL YM LC FN F FMS I F F YS LS I TKLNL I YL WGL S YQ S L 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRIRSRAAASRPRAGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLS SRFRR VD I DEFDENKF VDEQBEAAAAAAEPGPDPS EVD 
GLLRQGDMLRAFHAALRNS P VNTKNQAVKERAQG WLKVLTNFK 
SSEIEQAVQSLDRNG VDLLMKY I YKGFEKPTENSSAVLIjQWHEK 
ALAVGGLGS I IRVLTARKTV 


5635 


3 


943


DRGPRSTATDTGRARVSFWRFPLDPGVKNSNVQISGEKRRFRTL 
RS LFH P FP VTRSGAPRAVLVGS S W P AKMVAPAVKVARG WSGLAL 
GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 
EKYVRELKKTQLIKAAPAGKTSSVFEDPVISKFTNMMMIGGNKV 
IARSLMIQTLEAVKRKQFEKYHAASAEEQATIERNP YTI FHQAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 
KKHQRTLM PE KLSH KLLEAFHNQG P VI KRKHDLHKMAEANRALA 
HYRWW 


5636 


2253 


1143 


LEDTICQHPPAEKKLYLYHRKLREVERNGIPRLPKDVFMDTHQG 
LTD VRAKVTG FS EG WDS VKGG FS S FSQ ATHS AAG AWS KPRE I 
ASLIRNKFGSADNIPNLKDSLEEGQVDDAQKAIiGVISNFQSSPK 
YGS EEDCSSATSGSVGANSTTGGI AVGASSS KTNTLDMQS SGFD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEEQLNDLTELHQNEILNLKQELASMEEKIAYQSYERARDI 
QEALEACQTRISKMELQQQQQQWQLEGLENATARNLLGKLINI 
LLAVMAVL L VFVSTVAN CWP LMKTRNRT FS TL FLWF IAFLWK 
HWDALFSYVER FFSft PR 


5637 


948 


2532 


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHLPPPHLH
HHHHPQHHLHPGSAAAVHPVQQHTSSAAAAA7^AAAAAAAMLNPG 
QQQPYFPS PAPGQAPG PAAAAPAQ VQAAAAAT VKAHHHQH SHH P 
QQQLDIEPDRPIGYGAFGWWSVTDPRDGKRVALKKMPNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDILQPPHIDYFEEIYVVTE 
LMQS DLHK 1 1 VS PQPLS S DHVKVFL YQ I LRGLKYLHS AG I LHRD 
IKPGNLLVNSNCVLKICDFGLARVEELDESRHMTQEWTQYYRA 
PE I LMGSRH YSNAI DI WS VGCI FAELLGRR I LFQAQS P I QQLDL 
ITDLLGTPSLEAMRTACEGAXAHILRGPHKQPSLPVLYTLSSQA 
THEAVHLLCRMLVFDPYKRISAKDALAHPYLDEGRLRYHTCMCK 
CCFSTSTGRVYTSDFEPVTNPKPDDTPEKNLSSVRQVKE I IHQF 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA 
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5640 


280 


1092 


QQ GNKKTMLS HNTMMKQRKQQ ATAI M KE VHGNDVDGMDLGKKVS 
IPRDIMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLBALYPK 
LFKP EGKAE LP D YRS FNR VATPFGG FE KASRM VKF KVPD FE LLL 
LTD PRFMS FVNPLSGRRS FNRTPKG W I S ENI P I VI TTEPTDDTT 
VPESEDL 


5641 


27 


332 


CRHNCNGDVKLLSNQMDKLFAFHLFTFHGLLHFLDGSIQKLIQA 
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 


ITPCRMDFLVLFLFYLASVLMGLVLICVCSKTHSLKGLARGGAQ 
IFSCI IPECLQRAMHGLLHYLFHTRNHTFIVLHLVLQGMVYTEY 
TW EVFGYCQE LELSLHYLLL P YLLLGVNLFF FTLTCGTN PG 1 1 T 
KANELLFLHVYEFDEVMFPKKVRCSTCDLRKPARSKHCSVCNWC 
VHR FDHHCVWVNNC I GAWN I R YFL I YVLTLTASAATVAI VS TTF 
LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM 
LG F VWLS FLLGG YLL FVLYLAATNQTTNEWYRGDWAWCQR C P L 
VAWPPS AEPQVHRN IHSHGLRSNLQE I FLPAFPCHERKKQE 


5643 


1 


847 


PSGGVRDVETRGPGSRAARGPRWMHRRGVGAGAIAKKKLAEAK 
YKERGTVLAEDQLAQMSKQLDMFKTNLEE FAS KHKQE IRKNPEF 
RVQFQDM CAT IG VDPLASGKG FW SEMLG VGDF YYELG VQ 1 1 EVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDL I RAIKKLK 
ALGTGFGI IPVGGTYLIQSVPAELNMDHTVVLQLAEKNGYVTVS 
EIKASLKWETERARQVLEHLLKEGLAWLDLQAPGEAHYWLPALF 
TDLYSQEITAEEAREALP


5644 


83 


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAIEDKDMQQKEQQFREWFLKEFPQIRWKIQESIERLRVIANE 
IEKVHRGCVI ANWSGSTGI LSVIGVMLAPFTAGLSLS ITAAGV 
GLGIASATAGIASSIVENTYTRSAELTASRLTATSTDQLEALRD 
I LHD I T PNVLS FALDFDEATKM I ANDVHTLRRS KATVGRPL IAW 
K x V tflJN VV ti 1 UK I KGAP i KI VRKVAKNLGl^TSGVLVVLDVVNL 
VQDSLDLHKGEKSESAELLRQWAQELEENLNELTHIHOSLKAG 


5645 


537 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL 
YLCLQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTS PHLLPTMLLS S CLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS I PATKRASFLSSFI KMFFEELEYILGF 
LSLLKFHVHVS VYSAI CHFQKEGTGNSRS FTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
E EDTQRHET YHQQGQCQVL VQRS PWLMMRMG I LGRGLQE YQLP Y 
QRVLPLPI FTPAKMGATKEEREDTP IQLQELLALETALGGQCVD 
RQEVAEITKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648 


7 


1518 


VLSELCGRHEALREVGAEWPPPTCSPNICSGLQQAGNTDWSLTM 
APQSLPSSRMAPLGMLLGIiLMAACFTFCLSHQNLKEFALTNPEK 
i>s» l KJsioRIUsTKAEEELDAEVLEVFHPTHEWQALQPGQAVPAGS 
HVRLNLQTGEREAKLQ YEDK FRNNLKG KRLD I NTNT YTS QDLKS 
ALAKFKEGAEMESS KEDKARQAEVKRL FRP I EELKKDFDELNW 
IETDMQIMWLINKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD 
LLSFGGLQWINGLNSTEPLVKEYAAFVLGAAFSSNPKVQVEAI 
EGGALQKLLVIIATEQPLTAKKICVLFALCSLLRHFPYAQRQFLK 
LGGLQVLRTLVQEKGTEVLAVRVVTLLYDLVTEKMFAEEEAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQTLG VLLTTCRDR YRQD PQLGRTLAS LQ AE YQVLAS LELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL 
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE
LEKRREESQHEIKDVLWTNDQWHWVQSIGLRDYAGNLHESGV
HGALLALDENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5650 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE
LEKRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV
HGALLALDENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5651 


646 


1869 


ARQGQRQP WG * EARAKGPASES PRV* EGSG WEGPAS P *TPGSTL 
AWGEGAGIR*ASGLTAAGAASAAAA/PPPTRGGPAPAGCGRAPP 
WPAPLRVPTHGRAPAPRSRAAPRAPALSHGTAAAALS PAS PAGP 
AUk> * jjf (jHS SQS PPRG * RWGR S RS APAPAHPEH PAP AGS AS ASQ 
QTPGWPGSCCLAQGWQAEPLGAPGAEDG\PVPPQRGFPLGTLGS 
PAGS WAGLAG YG * AGAPGTQATAPRAAGQT P VAAAPN CRV* GSA 
PALHRAPAAADPGSPLQAPPRAWASPAAAGPGLSSSDYCGGLGA 
G WRAGI SPELLGAAGLSDNWARCPGPG PAE *GGQPGCRTI PAS A 

CMPSPPVEGSLGLSRKGHGDLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 


5652


735


343


HHKKYQHIHQKSFSCPEPACX3KSFNFKKHLKEHMKLHSDTRDYI
CEFCARSFRTSSNLVIHRRIHTGEKPLQCEICGFTCRQKASLNW
HQRKHAETVAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA


5653


66


1401 


RGRLQS RGRLTLGL VLL LLD I LGARQH6QR VSHG W KGG FLTAPL 
CFPQPCQPGTRRGRRRSLKEATEPQLAMAEEFVTLKDVGMDFTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
LEPLAGGS PEATS PDVTETKNSPLMEDFFEEGFSQEI /SRDVIQ 
G WLLE LQ FRRS L YRGHLVR * FARRSRKSSEV* YCHQRGKSHGMQ 
ES * I KERTQS C VHR FHGRR FHG \ DNVS E KTLT PAKS KE YRGEF F 
SYSDHSQQDSVQEGEKPYQCSECGKSFSGSYRLTQHWITHTREK 
PTVHQECEQGFDRKASHSGYPKTHTGYKFYVCNEYGTPFSQSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 
E CG KAFTR I FHLTRHQ K I HTRKR YE CS KCQAT FNLR KHL I QHQK 
THAANV 


5654 


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAL 
NWKP FVYGGLAS ITAE CGTFP I DLTKTRFQ IQGQTNDAKFKE 1 1 
YRGMLHALVRIGREEGLKALYSG*VGLHAFLCHCSLFHMGIDFR 
PRLHRSQVKSLRCV*KEQIA* * /MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNKKICLFKNI 


5655 


2 


867 


RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLSFPPP 
PLS YP S VFP AVAR VL P QRSGDYRAAGMPQLSGGGGGGGGD P E LC 
ATDEMIPFKDEGDPQ\REKIFAEIVNPEEEGDLADIKSSLVNES 
EI I PASNGHE VARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PS YSS YSGYI MMPNMNNDP YMSNGSLS PP I PRTSNKVPWQPSH 
AVHPLTPLITYSDEHFSPGSHPSHIPSDVNSKQGMSRHPPAPDI 
PTFYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1066 


PRRVP PLPE FASGPGAAFFHSGRLQRS LTKDSAGCFSQCRS RAM 
LVLRS GLTKALASRTLAP QVCSS FATGPRQ YDGTFYEFRT Y YLK 
PSNMNAFMENLKKNIHLRTSYSELVGFWSVBFGGRTNKVFHIWK 
YDNFPHRAEVRKALANCKEWQEQS I I PNLARIDKQETE ITYLI P 
WS KLQKP PKEG V YE LAVFQM KPGG PALWGDAFE RAI NAHVNLG Y 
TKWGVFHTEYGELNRVHVLWWNESADSRAAVRHKSHEDPISWG 
GVRESVNYIAVSQQNM 


5657 


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSQTESEEES SEMDDED YERRRS ECVS EMLDLE KQFSELKEKL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKLLLYDTLQGELQERIQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658 


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALL KCTDTE LQLRRDA I FCQAL VAAVCT FS EQLLAALG Y 
R YNNNGE YEES S RDASRKWLEQVAATGVLLHCQSLLS PAT VKEE 
RTMLEDIWVTLSELDNVTFSFKQLDENYVANTNVFYHIEGSRQA 
L KV I F YLD S YHFS KL P SRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAEDLQQD INAQSLEKVQQYYRKLRAFYLERSNLPTDAS T 
TAVKI DQL I RP I NALDELCRLMKSFVHPKPGAAGS VGAGL I P I S 
S E LCYRLGACQM VMCGTGMQRSTLS VS LEQAAI LARSHGLLP KC 
IMQATD I MR KQG PRVE I LAKNLR VKDQMPQGAPRL YRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVSPKGELGAWRGNSGRPKIIGRAAEAENEDRTLGRLLP
GNERSQPRSPLRLLAPQLKAEAAADKGLAPVPPPFSSGHSGPC\
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG
EDNPAGAGG\AAVAGAAGGARRFLCGWEGFYGRPWVMEQRKEL
FRRLQKWELNTYL 


5660 


229 


853 


PVTMWAFSELPMPLLINLIVSLLGFVATVTLIPAFRGHFIAARL
CGQDLNKTSRCXDIPESQGVISGAVFLIILFCFIPFPFLNCFVKE 
QRKAFPHHEFVALIGALLAICCMIFLGFADDVLNLRWRHKLLLP 
TAAS LPLLMVYFTNFGNTT I WP KPFR P I LGLHLDLGR * S YHCC 
P YGT YFRE P FLVLH I L LQ VFLFCLCVF PDP FW 


5661 


2 


473 


LNLYPSPCGGIPKLPGLPREAAAALGASFLAEAPLPVTVRGSGL
AGMAVTCDPKAFLS ICFVTLVFLQLPLAS ICQN*GTDSCASRGK 
AD FD VTG PHAP I LAMAGGHVELQCQLF PN I S AE DMELRW YRCQP 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKLSVRDALGAQNASGERIKIQGWIRSVRSQKEVLF 
LHVNDGSSLESLQVVADSGLDSRELTFGSSVEVQGQLIKSPSKR 
QNVELKAEKIKVIGNCDAKDFPIKYKERHPLEYLRQYPHFRCRT 
NVLGSILRIRSEATAAIHSFFKDSGFVHIHTPIITSNDSEGAGE 
LFQLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHLAEFYMIEAEISFVDSLQDLMQVIEELFK 
ATTMrtVLS KCPEDVELCHKF I APGQ KDRL * HML KNNFL IIS YTE 
AVE ILKQAS QNFTFTPEWGADLRTEHEKYLVKHCGNI PVFVINY 
PLTLKPFYMRDNEDGPQELEGSVA*HSLGLMILLSIVVIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL 
VQSYFEKGPLTFRDVAIEFSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLGIALTKPDLITCLEQGKEPWNIKRHEMVAKPPVICSHFP 
QDLWAEQDI KDS FQEAILKKYGKYGHANFQLQKGCKS VDECKVH 
KEHDNKLNQCLI PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG 
GPPPGWDAAPPERGCEIFIGKLPRDLFEDELIPLCEKIGKIYEM 
RMMMD FNGNNRGYAFVT FSNKVEAKNAI KQLNNYE IRNGRLLGV 
CAS VDNCRLFVGGI PKTKK 


5665


347 


702 


WQHL 1 1 LLHCE RTS PAM I TS EL P VLQ DS TNETTAH S DAGS ELE 
ETE VKGKRKRGRPGRPPSTNKKPRKSPGEKSR I EAG IRGAGRGR 
ANGHPQQNGEGE PVTLFEWKLGKS AMQRC 


5666 


213 


540 


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAALVFYSCIFII 
GLFVN ITALW VFSCTTKKRTTVTI YMMNVALVDLI FIMTLPFRM 
FY YAKDEWPFGE YFCQ I LGA 


5667


1 


695 


HPLPSASLGLPS VS LGVSLCVRS ALLEAWPML P KRRRARVGSP 
SGDAAS STPPSTRFPGVAI YLVEPRMGRSRRAFLTGLARS KGFR 
VLDACSSEATHVVMEETSAEEAVSWQERRMAAAPPGCTPPALLD 
I S WLTES LGAGQ P VP VECRHRLEVAGP S KGPLS P AWMPAYACQR 
PTPLTHHNTGLSEALEILAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQLQ 


5668 


691 


894 


CSFLFCIPDLFLQFLLGRKEEEAVLVGGEWSPSLDGLDPQADPQ 
VLVRTAI RCAQAQTGIDLSGCTKW 


5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPQAYTPFIYLHARKRRGEIGD 
ADSRFTORYAHKSAQLYFLYFVCWIFQDVYYFTIKEKNHFFFPK 
ARGAPTKYSGSPIGSPTTTPPTRPPSFNLHPAPHLLASMQLQKL 
NSQ 


5670 


3 


373 


SSECLTMAWIPLLLPtiLILCTVSVASYELAQPSSVSVSPGQTAK 
ITCSGDVLAKKYARWFQQKPGQAPVLVIYKDTBRPSGIPERFSG 
S TSGTTVTLTI SGAQVEDEAD YFCYSATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLGMESAITLWQFLLQLLLDQKHEHLICWTSNDGE 
FKLLKAKKVAKLWGLRKNKTNMNYD KL S RALRLL FMT 


5672 


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSLYIRIVEGKNLPAKDITGS 
SDPYCIVKVDNEPIIRTATVWKTLCPFWGEEYQVHLPPTFHAVA 
FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


IT VADQI SHWSAGR I KNRTRI PECIHSSAATTLAGPHTMEGE S V 
KLSSQTLIQAGDDEKNQRTITVNPAHMGKAFKVMNELRSKQLLC 
DVMIVAEDVEIEAHRWLAACSPYFCAMFTGDMS 


5674 


17 


984 


GGGSMEGESTSAVLSGFVLGALAFQHLNTDSDTEGFLLGEVKGE 
AKNS I TDSQMDD VE WYT ID I QK YI P C YQL FSF YNS S GEVNEQA 
LKKILSNVKKNWGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DL VFLL LTPS 1 1 T E S CS THRLEHSL YK PQ KGLFHR VP L WANLG 
MSEQLGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLKEVHKIN 
EMYAS LQEELKS I CKKVEDS EQAVDKL VKDVNR LKRE I EKRRGA 
QIQAAREKNIQKDPQENIFLCQALRTFFPNSEFLHSCVMSLKID 
MFLKVAVTTTTISM 


5675 


80 


753 


EG S RRG PTRLARLS ARAGRLHFP PGFS S RL I HFRGVS ECRRP PG 
KSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRWREPVRKVT 
LLMVGLDNAGKTATAKG IQGE YPEDVAPTVG FS K INLRQGKFE V 
TI FDLGGG I RIRG I WKNYYAES YGVI FWDS SDEERMEETKEAM 
SEM LRH PRISGKPI L VLANKQDKEG ALG E AD VI E CLS LE KL VNE 
HKCL 


5676 


2 


930 


FVSSPPPRPVQPARPGGFGLSGRRSLLCQVASTPAHVGVMRSPV 
RDLARNDGE E S TDRT PLLPGAP RAE AAPVCCS AR YNLAI LAP FG 
FF 1 VYALRVNLS VAL VDMVDSNTTLEDNRTS KAC P EHS AP I KVH 
HNQTGKKYQWDAETQGWILGSFFYGYI ITQ I PGGYVAS KIGGKM 
LLGFGI LGTAVLTLFTP I AADLGVG PLI VLRALEGLGEGVTFPA 
MHAMWSS WAPPLERS KLLSIS YAGAQLGTVI S LPLSG I I CYYMN 
WTYVFYFFGTIGIFWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHLVAAARAA 
VTAETHPLPLLAPLAVCQS VKS PAACQVRPRPRAVALPAALGGP 
GRSLPGLTAATMSSFSESALEKKLSELSNSQQSVQTLSLWLIHH 
RKHAGP I VS VWHRELRKAKSNRKLT FL YLANDVIQNS KRKGPE F 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSPPPKATEEKKSLKRTFQQIQEEEDDDYPGSY
SPQDPSAGPLLTEELIKALQDLENAASGDATVRQKIASLPQEVQ 
DVSLLEKITDKEAAERLSKTVDEACLRNRGPGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
S LAE FTEQ FNQLHNRRNENLQLGP LGEiDP PQECS TFS PTDSGE E 
PGQLSPGVQFQRRQNQRRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQPDFDVSKRLS LPMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKF9TRDYIMEPSIFNTLKRYFQAGGSPENVIQL 
LS ENYTAVAQTVNLLAE WL I QTGVEP VQVQETVENHLKS LLIKH 
FDPRKADS I FTEEGETPAWLEQM I AHTTWRDLF Y KLAE AHPDCL 
MLNFTVKVGRVLELRRKVFMNVYFWLLVCFL 


5680 


258 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKESQAEGYNRSGINNHQAEDPRFCPSFCWMRSA
RQTR PQRLRKEAARPPT PGS CPGGTGMDGKKCS VWMFLPLVFTL 
FTSAGLWIVYFI AVEDDKI LPLNSAERKPGVKHAPYIS IAGDDP 
PAS CWSQVMNMAAFLALVVAVLRFI QLKPKVLNPWLNISGLVA 
LCLAS FGMTLLGNFQLTNDEEIHNVGTSLTFGFGTLTCW IQAAL 
TLKVNIKNEGRRVGIPRVILSASITLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 


5682


39 


622 


PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPIPSPGLSAQTGL 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGLLLLVPLLLLPGSYGL 
PFYNGFYYSNSANDQNLGNGHGKDLLNGVKLWETPEETLFTYQ 
GAS VI LP CRYRYEPALVS PRRVRVKWWKLSENGAPEKDVLVAIG 
LRHRSFGDYQGRVHLRQD


5683 


89 


778 


GSCGATALITRCLAWSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCTVCS KKFAS FNAYENHL KSRRH VELEKKAVQAVNRKVEMM 
NEKNLEKGLGVDSVDKDAMNAAIQQAI KAQPSMS PKKAPPAPAK 
EARNWAVGTGGRGTHDRDPSEKPPRLQWFEQQAKKIiAKHSEDD 
SEDEEHDLC 


5684 


195 


677 


twcfrgylgprvimkaldeppyltVgI-dvsAkyrgafceakikt 
akrlvkvkvtfrhdsstvbvqddhikgplkvgaivevknldoay 
qeavinkltdaswytwfddgdektlrrsslclkgerhfaeset 
ldqlpltnpehfgtpvigkktnrgrrye 


5685 


779 


1262 


lllqqpwhcfllfppfrfshhmipgppgphttgiphpaivtpO 
vkqehphtdsdlmhvkpqheqrkeqepkrphikkplnafmlymk 
emranwaectlkesaainqilgrrwhalsreeqakyyelarke 
rqlhmqlypgwsardnyvspssipvalhs 


5686


128 


1181 


ctwwqwitlldindnhptwkdapyyinlvemtppdsdvttvva 
vdpdlgengtlvysiqppnkfyslnsttgkirtthamldrenpd 
pheaelmrkiwsvtdcgrpplkatssatvfvnlldlndndptf 
QNLPF VAE VLEGI PAGVS I YQ WAI DLDEGLNGLVS YRMP VGM P 
RMDFLINSSSGVWTTTELDRERIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEDVPR\GSGWSG*AARN 
NDVGLNAELSYFITGGNVDGKFSVGYRDAWRTWGLDRETTAA 
YML I LEAI DNG P VGKRHTGTAT VFVTVLD YNDKRPI ILQS S YV 


5687 


17 


917 


AAPPAPPDG/PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA 
QGDGGAAAVGH VL WP AVGPVR VNPGLQTP VP RPELLPG P \ S SS 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRM PSTSASE /AAGGQGACTHAKGS ETPPPAS PQTSEPAPSP 
LPPHLTGGPGMYSSEAKLPNSFSCLGLAGTGAGI*GTASAHGrG 
PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 
SSPFVAIGSCWLRGI P PPGSGFLCPGRAPGPVPI TTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDLFGNCYRLLKTGIEHGAMPEQVGVYWYS/CLYDSRKLFF 
*SHMI IRSLL*KVIDDSLGQLPLLRELLL* *LNVIDRCI ILAYV 
LRVEKTFAI T YL KNFTVKVDFSLLGE I PLI S MAA I LKLWI MK ID 
DGYIPAVF 


5689 


1504 


3 


HE LSG KH I SM VS GNTCN WH PGGHS PGGGGQG EITSKDRGEI PAL 
IWA/RKPIGTWTATKPTHRAG*GGAEFYOPPPnPrF , r;DP<I r T , QPr 
GEG *GHAVGPGRE I GKEG S L PFLG P KALGF * S AS CQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\S PLPACPPRAWPKAGAVASATGTG\ PQLPGSRGKQ 
KLPRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDSSAS\ 
PQTPGGRGSLEWG LPLYLGPHHDVK* RSDRLG* PP * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL*RVPPGSLGPSTQCMYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPS ASSSHR* GG * ERARAGAGHRGST * A 
S S KI EQGRPR PG PTS DALADVEGGAES /G PHPW P L PGTLPNR / P 
GS PPPA+ AS AGRKGTVSTLGGGLL 


5690 


1424 


58 


PSPPAGVCAAPAPLPLLALARRDRRPCSPGAEAAPWQTGGPAID
GAWRTS VS ALRRGATG/APCS PGAEAAPWQTGGPAI DG\DGELP 
*VRSEEAPRGCGAEGGGPGSGPVRRPGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAALPERTRGVAEPPAWAHAGSDAWRAGR*SQRT*ERARPRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLSLLGP/ PGAHNLDTAPQDR* HGP*GDKRGAPGVAGEDPRPP * 
GNFVR * LLLMP/GVA* RHGTS P FLGPS LGENGGQ WDSGNLFGTP 
KG * SH P AFTKST * SME AEKS Y WNH PHR \ DRGRQG VR I NCLR VG E 
S EM WG P YSAP RPGT VFLSS FL S PASEEH \ PE GS SS FNTP FP PAG 
PEGDPGLNS PGLLP 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGILSFIED 
VAHRMLATGECTPEDLCFSLQVMQ * KTGTES WG*RFYI VEQN* S 
GDAPL I FSP YLSLTGNCGFAMLVEITERAMAH\CGS PGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


548


TQAWTRAEKDRKGSVRAIiRIiHLERGPPT*RGSHPL\QSVPCIQK
PS I FSSYPI /GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPIj 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNLP VMGATRSNLQP PRKVAVPG PTR * RDQDS KQDFS S KPLQS 
VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 


5693 


1258 


1330 


ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH 
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5695 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI


5696 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5697


1147 


47 


PSEALSPPACPSAPAPRRSIISRLFGTSPATEAAPPPPEPVPAA 
QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SS EEEAEVAAPTKGPAPAPQQCSE PETKWSS I PAS KPRRGTAP T 
RTAAPP WPGG VS VRTGPEKRSS TRPPAEMEPGKGEQASSS ESD P 
EGP IAAQMLS F VMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAE PQEDLP PLSQS SRFFQEQQKMNKSLGPVSFKDVAVDFT 
QEEWQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPS.RQTVFI 
ETLI * R / E RGNVPGNTFD VETNP VP SRKIAYTHS LCNS CER \GF 
NAS SE YI S S DGR YARMKADE CSGCGKSLLH I KLE KTHPGDQA YE 
FNQ 


5699


2 


1448 


RVRQP PGLWVRRTVPAMQCPAGLSRVPGVAG/DPSLPS FRGPRD 
EAAHRGTIQTARHTRKLYVQGPASGPPLPRVSTQVA I * DEKPLA 
R PS / GRTNAP FP QGQKPAG KAAPGPAAAGR VAMR \ PGHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL+RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS*HLDPNT 
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VPILFQNPSGALRSRRTEPAGWVP PTRHE * DDG*TAAPASGGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLG CR /S MLPAS S G P P PAPG PRRLAAGAHTS AS ARCP PAAAA 
GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA 
HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 


597 


NGHKGVWE IN I Y* RRSNI HKNS KS ESHLNQDHS FPP PTPNSARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTI CQE FS SCLQCAYLD 
E * CS I AS SL I KAILRVS VLSE 


5701 


59 


410 


I FEKI CSDTQE FI S PE INPQI CS WL I FDKGAK/ NHATGKDSLFN 
KWS WKNWLSTCR*MRPGPYFTPYTKINSK* I K/DANIRCETVKL 
LE ENTGENLHDTG LGNVFLDMTP KTQPTKQK 


5702 


3


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PVITPSRASESSASSDGPHPVITPSWSPGSDVTLIAEALVTVTN 
I E VINCS I TE I ETTTS S I PGASDTDLI PTEGVKASS TSDP PALP 
D S TEAKPH I TEVTAS AETLSTAGTT E SAAPHATVGTPLP TNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 
PVSIEAGSAVGKTTSFAGSSASSYSPSEAAIiKNFTPSETLTMDI 
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 
KG*CSSSTGNSTPTRLTSRSPYCVSGEANG/PSAAARHVPYAKR 
G C CP * PG P P PTDCS CVTVLRGTQ KVPMKGS MS KPLT PDVATGP S 
LTS TGVY VWGGAS P VPRG VLGLTLAH VLC FS KE KT 


5703 


14 


1117 


HHKDSRSQGLPRTQECARPELRPLLCPRALWPVTRLSYRCPWQA 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS 
RILRWHP *HTAAR* PRWRRLPSSHRWTRHLGVLRVQDKS * * VSL 
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW+APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS+WCPWL*AARWTGWRTASGASAGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR * H* TAGAPAS VRS S QGATRS PAPGGDQC 
ACGRGPGSC+HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5706 


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD 
DYVANTDNCSLKDLVRECERRYCAFNNWGSVEEQRQQQAELLAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKHKQELRENESNWAYKALLRVKHLMLLHYE I FVFLLLCSI 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGPRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAIQPGLAEGGQFLGDPPPGLCQPELQPDSNSNFMASAKDANE 
NWHGM PGRVE P I LRRS S S ES PSDNQAFQAPGS P E EG VRS P PEGA 
EIPGAEPEKMGGAGTVCSPLEDNGYASSSLSIDSRSSSPEPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGQPRPYLDLPA 
QASVSRPHDRA+GEAVSLSLSSGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVS PASGGPRKEGRQGSGG * AGGGGP \ ARTHADLPCVGFVCS PP 
LLK*SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 
SRRRRGP * AAGRSTPAVP * PCS * GGAGRRAYACRTGWGYAPSR* 
LEPSGPTSGSAL*TWASHSTGA**SRLCGTAGTGPLCSQSSRS* 
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMG S RCVCTCTGL PC PG I PLS GAS PGGSGETGAGRSHTLK 
AARSRLS PRPGSGSRGS Y+ SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTS SDH * YTGTRRP WAG PGTRCSTAPSRAAP P VSRCRP PPP P P 
PPRPPRLPAAAS / SGGASGS PAASCSCSCRAPAKPAS S / GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


ITLCPLPQTEKCLNVVTEAATPLGIYLKARVEAGGLKELEISWG 
LHQ I WRWGAWMRAGMGGCR CWGVMAPFAP R/NALS FLVNDCS 
LIHNNVCMAAVFVDRAGEWKLGGLDYMYSAQGNGGGPPRKGIPE 
LEQ YD P PE LADS S GR WREKRS ADM WRIjGCL I WE VFNG P L PRAA 
ALRNPGKI PKT L VPH YCEL VGAN P KVRPNP AR FLQNCRAPGGFM 
SNRFVETNLFLEEIQIKEPAEKQKFFQELSKSLDAFPEDFCRHK 
VLPQLLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKI I PVWK 
MFSSTDRAMRIRLLQQMEQFIQYLDEPTVNTQIFPHWHGFLDT 
NPAIREQTVKSMLLLAPKLNEANLNVELMKHFARLQAKDEQGPI 
RCNTTVCLGK I GS YLS AS TRHRVLTS AFSRATRD P FAP S RVAGV 
LfGFAATHNLYSMNDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS. 
FLSKLESVSEDPTQIEEVEKDVHAASSPGMGGAAASWAGWAVTG 
VSSLTSKLIRSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLLGTGLA 
GAKLPGATS * R YTAGQRV 


5710 


1 


562 


IPGSTISCEVELMARMAKTIDSFTQNQTRLWIIDGLDACEQDK 
VLQMLDTVRVLFSKGPFIAI FASDPHI I IKAINQNLNSVPSGFK 
\LNGHDYMRNI VHLPVFLNSRGL/RQ/LQENFS * LQQQMETFHA 
QILQGYRKMLTEEFHRTALGR*QNLVARQPSIDG*DAIGFELYV 
CIAI QFNTNKDDAT 


5711 


1526 


1130 


RRH P FQWTT VTQE AFSHHD VAFTS TP VL FYPDS AQ P F I VKS ESS 
SQIAKAVLSQQRPSLFHECAFHFFS*SLQRHTINLDQGIF*LLM 
LSEERQHLFESS / I WTTPHNLK* / FE IHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSLDISERLKFLLTLDCVDDTLIVLAEEHGCLDIIKELP 
ETVIDLLNKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPFTKPA 
SLFSSSLRCADLTLPEDISQLCKDINNDYLAERSIEEVYYLWCL 
AGGDLEKELVNKEI IRSKPPI CTLPNFLFEDGES FGQGRDRSS/ 
T FR * YHWD I WM PAKK* I ERCWGRS I LP I TLKMTS L I LP YSNSN 
NELS AAATLPL 1 1 RE KDTE YQLNR 1 1 L F DRLLKA YP Y KKNQ I WK 
EARVD I PPLMRGLTWAALLGVEGA IHAKYDAIDKDTP I PTDRQ I 
E VDI PRCHQYDELLS SPEGHAKFRRVLKAWWSHPDLVYWQGLD 
SLCAPFLYLNFNNEALVYACMSAFIPKYLYNFFLKDNSHVIQEY 
LTVFSQMIAFHDPELSNHLNEIGFIPDLYAIPWFLTMFTHVFPL 
H KI FHLW \ DTLLLGE FL FP I L YWE 


5713 


634 


284 


PVCAVPVDRWPVLPREDQEGQQL*AKLPRDFRR*FQILGPMEGH 
TACRCSRRGAQVQHLPREDIRAAE*DPHLREVWPGLPTSSATSP 
* RAVLTS PCSHLGS ADAAS SHWLCGVS FH 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KS A WAAAAP AS VADDTP P PERRNKS G 1 1 S E PLNKS LRRSRPLS 
HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPASKLQGGGGGLQTGWGLHPVPVTAASPLPRWCLFGAVAK\ 
GLPGP*LCPSGAA/GGLQRGPGLSPLGAAGKVSCLHPPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS 
PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 
QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVDLLPSAGR 
TQAPSGRGDAGKPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC+EV\GALGEPVRIPG 
L * PDLS C I LSNGS KHRREGLS FPR SLG PGRRGPAGLQ S LGCS PT 
PKNTACHS SGHVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP 
VWRWIPLE * LGLSRETGQATRRGLVWISPGRAAAACVACAQALE 
EGPLRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GLT/ GVPGTDPKRGGRKPGQSGQETOX3PTVWSGPES PLQPKP * E 
RQE / VGAGAS SGVGLSRGRAGGPSSAWEVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFS LLCEGPGHC YQGAVCREACAAAS PGLDS AAE PHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL* SGFFTI I VGG YSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLPSSVDTEDSLD 
EG PGALVLE SD LLLGQDLE F EEEEEEEEGDGNS DQLMG FE RDSE 
GDSLGARPGLPYGLSDDESGGGRALSAESEVEEPARGPGEARGE 
RPGPACQLCGGPTGEGPCCX5AGGPGGGPLLPPRLLYSCRLCTPV 
SHYSSHLKRHMQTHSGEKPFRCGRCPYASAQLVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG 
Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCX3QELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTCEKPYKCPL 
C P YACGNLANL KRHGR IHSGDKP FR CSLCNY S CNQSMNL I RHM 


5718 


120


284 


VAHAL SL P AE S YGND VS MTHPQLP PTQLAWDLCRTCL PI»S YNFT 
S**STADPLHL 


5719 


48 


428 


ELNNGPFQMPLCNGGNLAVTGS WADRS PLHEAAS QGRLLALRTL 
LSQGYNWAVTLDHVTPLHEACLGDHVACARTLLEAGANVNAIT 
IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1 


1051 


LQAFRNAS EVPMVLVGTQDAI SAA\NPRVYRRTSRARKLSTDLK 
\ RCT\ YYE \ TCGGTYGIiOMWSVSFODVAOKWaTA RTf T.a T 
GPCK\ SLPN\ S PSH\ SAVSAAS I PARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP\DRE KKAAGCKVDS IGSGRAI P I KQG I LLKRSGKS LNK 
EWKKKYVTLCDNGLLTYHPSLHDYMQNIHGKEIDLLRTTVKVPG 
KRL PRATPATAPGTS PRANGLS VERSNTQLGGGTGAPHS AS S AS 
LHSERPLSSSAWAGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
QHPASG 


5721 


97 


492 


RHSSPCCSLRRTRR^^NAAV^T /TTVOOPKPl?>T^kivbpUTnr i ira ' 
VFYAIAGGLFLERAYYYAFAAHHTGITDTTRVGI ILSRGTAAS I 
S FMFS Y I LLTMCRNL I T FLRETFLNR YVPFD AAVDFHR L IAS T A 


5722 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5723 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAAXQARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT
KSSTREIPEMI 


5724 


3 


1641 


FTNEAP PAPLP DAS AS PLS PHRRAKS LDRRS TE P S VTP DLLNFK 
KGWLTKQYEDGQWKKHWFALADQSLRYYRDS VAEEAADLDGE ID 
LS AC YDVTE YP VQRNYGFQ I HTKEGE FTLS AMTS G I RRNWI QT I 
MKHVHPTTAPD VTS SLPEE KNKS S CS FET CPP PTRTrr>F A PT.CPP 
DPEQKRS RARE \ RRREGRS KTFDWAE FRP I QQALAQERVGGVG P 
ADTH\DPWRPFJu^HGELERERARRREERRKRFGMLDATDGPGTE 
DAALRMEVDRSPGLPMSDLKTHNVHVEIEQRWHQVETTPLREEK 
QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSQKEASDLLEQ 
NRLLQDQLRVALGREQSAREGYVLQATCERGFAAMEETHQKKIE 
DLQRQKQRELEKLREE KDRLLAEETAAT I S AI EAMKNAHREEME 
RELEKSQRSiQISSVNSDVEALRRQYLEELQSVQRELEVLSEQYS 
QKCLENAHLAQ ALEAERQ ALRQ CQRENQ E LNAHNQE LNNRLAAE 
ITRIiRTLLTGDGGGEATGSPLAQGKDAYELEVPSGARPCLTQLC 
TQE PQGSAAWPLS YR WGGTDLRQQESQG PGRS KS P EGGEEQ 


5725 


3 


1049 


VNGHS EE T S QS PNRTE PHDSDCS VDLG I S KS TE DLS PQ KS GP VG 
SWKSHSITNMEIGGLKIYDILSDN\DLSSHLQPLK/FTSAVDG 
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
PE/GAiCYNKRPHKWAHNLHLKYMVLHSI ISNTVAV\RSQRHFVA 
LQTKS PNRPCQFSSS APS / VDQRAQ/ INQS YAKHSANMNFSNHN 
NVRANTAYHLHQRLG PARHGEMWAI S PNDRLI PAVTRS TIQRQS 
SVSSTASVNIiGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQR P L S ART YS I DGPNAS RPQS ARP S INE I P ERTMS VS DFN YS R 
TSP 


5726 


2 


486 


SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVKRLRLKGRPLGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
QGSPGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GLI FHLGQARTPP YLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPILILKETRRLPWATGYAEVINAGKSTHNEDQASCEVLTVKKK 
AGAVTST PNRNS S KRRS S L PNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG/LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
E P PQ VP EAGE E D AVPAEEG PGGT P ETQADQVRER P E AHLAEGGA 
KGS PRRLADPQDLPAGQMS LAPP FP PVAAVIRSNK 


5729 


1 


1525 


AGGARE VLTLQ LGHFAG FVGAHW WNQQDAALGRAT DS KE PPGEL 

QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KS I PNGKGSS PLPTATTPKPLIPTEAS I RVWSDFLRVHLHPRSI 
CMIQKYNHDGEAGRLEAFGQGESVLKEPKYQEELEDRLHFYVEE 
CD YLOG FO I LCDLHDG FSG VGAKAAE L LODF Y <3 pip rjTT TWP1T . T .t> 

GPYHRGEAQRNIYRLLNTAFGLVHLTAHSSLVCPLSLGGSLGLR 
PEPPVS FPYLHYDATLPFHCSAILATALDTVTCS \ YRLCSS PVS 
MVHL\ ADMLS FCGKKWTAGAI I P FPLAPGQSLPDSLMQFGGAT 
PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 
LHACTTGEE I LAQ YLQQQQ PG VMS S S HLLLTP CR VAPP Y PHL FS 
SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHISCFRCSYCNNK 
LS LGT YAS LHGRI YCKPHFNQLFKS KGN YDEG FGHR PHKDLWAT 
KIETEGFWERPRNFENQGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSS SREKGGQAS WNPKLRVA 


5731 


122 


443 


RSHRGE L I P KD S C YMRKP PRR P KKRRQG / CAL PQGC LTF KD VAI 
E FS LEE W KCLNPAQRAL YRAVMLEN YRNLES VGLTS KDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVTVTLVCGFTSFSFSLPLYLCGCLRF 
PERTCS QLQQAD WAPDFGPS S F VPS WGATATGARKFL I AFN I \N 
LLGTKEQAHR IALNLREQGRGKDQ PGRLKKVQG I G W YLDEKNLA 
Q VS TNLLDFEVTALHT VYEETCREAQELS LPWGSQLVGLVP LK 
ALLDAA 


5733 


1 


460 


PALQEVNANALAWGKQ YENDARTLFE FTSGVNDTES PI I YRDES 
MRTACS PDGLCSDGNGLiETj'KTP FT^R f)FMIf PR T .rSflFT? h t k q a vm 
AQVQYSMWVTRKNAWYFANYDPRMKREGLHYWI ERDEKYM \AS 
FDEI \VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLIPAYSKNRAYAIFFIVFTVI 
GS L FLMNLLTAI I YSQ FRG YLMKSLQTS L FRRRLGTRAAFE VL S 

GSVLLSAEEFQKLFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
HYYFDYLGNLIALANLVSICVFLVLDADVLPAERDDFILG1LNC 
VFIVYYLLEMLLKVFAIjGLRGYLSYPSNVFDGLLTVVLLVLEIS 
TL \ VCTDCHTQAGGRRWW / RLLS LWDMTRMLNML I VFRFLR IIP 
SMKPMAWAS TVLGL 


5735 


2 


540 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARRISL 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNTIAGRTYNDLNQ 
YPVFPWVLTNYESEELDLTLPGNFRDLSKPIGALNPKRAVFYAE 
RYETWEDDQS PP YH YNTHYSTATSTLSWLVRI VS I FIELACLWY 
LKILT 


5736 


1 


382 


gtrpstkksgyspqqvavihckghqkentavahsnqkadsaaqV 
TARLSVTPPNLLPTVSFPQPDLPDNPVYSTTTEKLASDLRANKN 
QES * * I LPDSGI F I P * T* TS YLQSTTHLRRAKLPQLLRR 


5737 


290 


1041 


KACLHLLS S F LTSNFLFNPLL P DS LYS VE ARS QRANLGP CRRKR 
LQTLMRLAAGFQYSSHKDPSLSAKEKHTDYHNEARGPWPGWVG* 
RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLECGTPAP LQLE I P PQ PRGHPAP I PTGQAG PRDS G PGAS P * V 
ETRPLTDGRR*PGVRPVGWTPAHPAGTLRPRGAVEPSVSACGKW 
APS PTSQGCCEGRCDAVPKHRAWRT PLCS Q 


5738 


8 


460 


DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIPTKTYSNEV 
VTLWYRPPD I LLGSTDYSTQ I DMW * GQVEVWQGPCGKGGGLVTT 
ATQPAAFLFTVPSLPRGVGCIFYEMATGRPLFPGSTVEEQLHFI 
FRILSE E AW ALCAVE THR 


5739 


1 


1222 


SFQRRGIRWNVHTLHPHPRAVWAGIGRGHGS*ALLGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDLLAEV 
S AE VDGP VPG YLSS POS I TDTCI/YT FTSGTTCJT .P V A ABTQUT.VT 

LQCQGFYQLCGVHQBDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HKVRIAVGSGLR PDT WERFVRR FG PLQVL ET YGLTEGNVATINY 
TGQRGAVGRASWLYKH I FPFSL I RYDVTTGE P I RDPQGHCMATS 

PGEPGLLVAPVSOOSPFIjGYARr;PFT,nnnKT.T,K"nVT7T5 DHn , 57"G , T7M 

TRDLLVCDDQGFLRFHDRTGDPFRWKGENVATTEVAEVFEALDF 
LQEVNVYGVTV 


5740 


265 


231 


PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
YVYERVTO*NISRMVHALEQKRHPAGLSSSMALQLNPCLGMLMA 
LQSELHKLYDEETQSWVSGSACGGYP 


5741 


1 


650 


PRKTMRRGVLMTLLQQS AMTLPLW I G KPGDRP P PLCGA I P ASGD 
YVAR PGDKVAAR VKAVDGDEQW I LAE WS YSHATNKYE VDD I DE 
EGKERHTLS RRRVI PLPQWKANPETD P EALFQKEQLVLAL YPQT 
TCF YRALIHAPPQRPQDD YS VLFEDTS YADG YS P PLNVAQRYW 
ACKEPKKK*CRLADSPSPNDTGQDSRGRAG2 KHI PPLKKK 


5742 


2 


362 


TQSVKEILKRNPNVNLTDKDGNTALMIASKEGHTEIVQDLLDAG 
TYVNI PDRSG DT VL IGAVRGGHVE I VRALLQ KYAD I D I RGQDNK 
TAL YWAVEKGNATMVRDI LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQENGLEEVKPLGEMQTDL 
KATGRE ISPREKTPEVIDATEEIDKDLEETGRRE IS PEENGPEE 
VKPVDEMETDLKTTGREGSSREKTREVIDAAEVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTRQMTTTPAALPTTWTTPDLTTGTPLQMTTIA 
VFTTANTCLSLTPSTLPEEATGLLTPEPSKEGPILTAESETVLP 
SDS W S S AE STS ADTVL LTS KES KVWDLP S TSHVSM WKTS DS VS S 
PQPGASDTAVPEQNKTTKTGQMDGIPMSMKNEMP ISQLLMI IAP 
SLGFVL FALFVAFLLRGKLMET YCSQKH TRLD Y I GDS KNVLNDV 
QHGREDEDGLFTL 


5745 


1400 


599 


GKSRFVNLMKHSKKTYDSFQDELEDYIKVQKARGLEPKTCFRKM 
KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 
YI CGS HGVEHR VY KHFS SDNS TS THO A SWKO THO K"P TO? UD"B T?n d 
E KS EEERS KHKRKKS CEE I DLD KHKS I QRKKTEVE I ETVHVS TE 
KLKNRKEKKSRDWSKKEERKRTKKKKEQGQERTEEEMLWDQSI 
LGF 


5746 


3 


821 


SFASGRLTPSSPAFDGELDLQRYSNGPAVSAWSLGMGAVSWSES
RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 
LEE RALLRE ARLGRARS S GGMQAT PATEGLARPQAP S SSAFRC P 
YCKGKFRTS AERERHLH I LHRPWKCGLCS FGS SQEEEL LHHS LT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 
EAT PTPA P AAPEE P PAP P E FRCQVCGQS FTQS WFLKGHMRKHKA 
SFDHACPV 


5747 


2 


1328 


DRHVETLCIHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFYHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRNTE 
ESSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD 
S DNGD INYD YVHELS LEMKRQ KI QRELM KLEQENME KREE 1 1 1 K 
KEVSPEWRS KLS PS PS LRKS SKS PKRKS S PKS SSAS KKDRKTS 
AVS S P LLDQQ RNS KTNQS KKKG P RTPS P P P P I PE DI ALGKKYKE 
KY KVKDR I EE KTRDG KDRGRDFE RQREKRDKPRS TS PAGQHHS P 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRG KRDRE KD S REEREYEQDQS S S RDHRD DREPRDGRDR 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQFSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGSGAGVI S KTLTYPLDLFKKRLQVGGPEHARAA 
FGQVRRYKGLMDCAKQVLQKEGALGFFKGLSPSLLKAALSTGFM 
FFSYE FFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS 
S AS S T YS S AE ERMQS EQ I RKLRRE LES S QE KVATLTS QLS ANAN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAV IQGALNAS ETT P KELR I KRQNS SDSISSLNSITSHSSI 
GSSKDADA 


5750 


22 


866 


IFISICLWNAHLCFLLLPKDCIDQVMKLQNLFVDDSGRYLAIQF
HLEWAYVFLYYYEYRKAKDQLDIAKDISQLQIDLTGALGKRTRF
QENYVAQLILDVRREGDVLSNCEFTPAPTPQEHLTKNLELNDDT
ILNDIKLADCEQFQMPDLCAEEIAIILGICTNFQKNNPVHTLTE
VELLAFTSCLLSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ
TQALADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL
GCTSSALQIFEKLEMWE


5751 


3 


751 


SCGSALRAWRCGAAALATFPAPALPGLMYRALYAFRSAEPNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRRLQGLEQ 
DVLQAIDRAIEAVHNTAMRDGGKYSLEQRGVLQKLIHHRKETLS 
RRGPSASSVAVMTSSTSDHHLDAAAARQPNGVCRAGFERQHSLP
SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 


471 


GPVCGVGLSVAWAGPWRGPVHSVGGGGRAALHGAELPCLSGAAT
VEREMELRHKNEMLRVETEARARAKAERENADIIREQIRLKASE
HRQTVLESIRTAGTLFGEGFRAFVTDRDKVTATVNIFIKQGWQV
AERQHVGASWSPRSCPCRLCTAL


5753 


34 


463 


DDSXAIPGGVQAPFGAVRNIYTPRTGHRIRKLDQIQSGGNYVAG
GQEAFKKLNYLDIGEIKKRPMEWNTEVKPVIHSRINVSARFRK
PLQEPCTIFLIANGDLINPASRLLIPRKTLNQWDHVLQMVTEKI
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHWEFAGEHAEAIASREQEVLQGWKELLSACEDARLHVSST
ADALRFHSQVRDLLSWMDGIASQIGAADKPRCPSSLLGLPASPW
WPTPATPSPLTAPFSME


5755


3 


888 


LGDQFYKEAIEHCRSYNSRLCAERSVRLPFLDSQTGVAQNNCYI 
WM E KRHRG PGLAPGQL YTYP ARCWRKKRRLHPPED P KLRLLE I K 
P E VELPLKKDG FTS ES TTLEALLRGEGVEKKVDAREEES IQE I Q 
R VLENDENVEEGNEE E DLE ED I P KRKNRTRGRARG S AGGRRRHD 
AASQE DHDKP YVCD I CG KR YKNR PGLS YHYAHTHIiAS EEGDE AQ 
DQETRS P PNHRNENHR PQKGPDGTVIPNNYCDFCLGGSNMNKKS 
GRPEELVSCADCGRSAHLGGEGRKEKEAAA 


5756 


3 


621 


SSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYRRKVAR 
WNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRHGLYSR
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI
AAFHLDRILDFRRVPPTVGRIVNVTKEIL


5757 


3 


473


YKDALLLPDNHRQWFENGTLKLTDVQKGMDEGEYLCSVLIQPQ 
LSISQSVHVAVKVPPLIQPFEFPPASIGQLLYIPCWSSGDMPI 
RITWRKDGQVIISGSGVTIESKEFMSSLQISSVSLKHNGNYTCI
ASKAAATVSRERQLIVRVPPRFW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRLAVSRTDGTVEIYNLSANYFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGEIMEYDLQALNIKYAMDAFGGPIWSMAASP
SGSQLLVGCEDGSVKXiFQITPDKI PV 


5759 


2 


1240 


GNAAFAGQGV V Y ETFHMSDLPS YTTMGT VHVWNNQ IGFTTDPR 
MARS S P Y PTDVAR WNAP I FHVNADDP EAV I YVCS VAAE WRNT F 
NKDVGADLVC YRRRGHNEMDE PMFTQP LM YKQ IHRQ VP VLKKYA 
D KL I AEGTVTLQEFE E E I AKYDR I CE EAYGRS KDKK I LH I KHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF 
KIHTGLSRILRGRADMTKNRTVDWALAEYMAFGSLLKEGIHVRL 
NGQDVERGTFSHRIiHVLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
SSLSEYGVLGFELGYAMASPNALVLWEAQFGDFHNTAQCIIDQF 
ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDFEVSQL 


5760


1 


1221 


VRDITSDSLSLSWTVPEGQFDHFLVQFKNGDGQPECAVRVPGHED 
GVT I SGLE PDHKYKMNL YG FHGGQRVGP VS AVGLTAPG KDSEMA 
PAS TE P PTP E P P I KPRLE E LTVTDAT PDS LS LS WTVP EGQ FDH F 
LVQ YKNGDGQ P KATR VPGHEDR VTI S GLE PDNKYKMNL YG FHGG 
QRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTC 
SSPDSLSLSWTVPQGRFDSFTVQYKDRDGRPQWRVGGEESEVT 
VGGLEPGRKYKMHLYGLHEGRRVG P VS T VG VTAPQEDVDET P S P 
TEPGTEAPEPPEEPLLGELTVTGSSPDSLSLSWTVPQGRFDSFT 
VQYKDRDGRPQAVRVGGQESKVTVRGLEPGRKyKMHLYGLHEGR 
RLGPVSAIGVT 


5761 


3 


1275 


SCDMAEAAALVWIRGPGFGCKAVRCASGRCTVRDFIHRKCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ I EKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRIiERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAASSKMVSAEISENRKRQWPTKSQTDRGASAGKRRCFWLGM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGSQRARWNTDHGS PEQLQ I PVTDSGRH ILEDS CAELGES K 
EHMESRMVTETEETQEKKAESKEPIEEEPTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCEbMALGLKCGGTLQ 


5762 


2 


344 


GSTGQT P LHSUGGGGGSGGGRRRTPRGM PKE KYE P PD PRRM Y T I 
MSSEEAANGKKSHWAELEISGKVRSLSASLWSLTHLTALHLSDN 
SLSRIPSDIAKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKDTGL I ML IARLD YEL I QR FTLT 1 I ARDGGGEETTGR VR I NV 
LDVNDNVPT FQ KDAYVGALRENE PS VTQLVRLRATDE DS P PNNQ 
I T YS I VS AS AFGS YFD I S L YEG YG V I S VS R PLD YEQ I SNGL I YL 
TVMAMDAGN 


5764 


19 


441 


VCARACGEMRQLLRP I DRQR YDENEDLSDVEEI VS VRGFSLEEK 
LRSQLYQGDFVHAMEGKDFNYEYVQREALRVPEjIFREKDGLGIK 
MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKL 


5765 


3 


825 


QKIIiRLNNSHQPPTSSSNSKDCGGPASSGAGATAALADGLkFAS 
VQASAPQGNSHKETS KS KVKRS KTS KDANKS L PS AALYG I PE I S 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPSGGHLYG FGAKSNGGGAS PFHCGGTGSGS VAAA 

GEVSKSAPDSGLMGNSMLVKKEEEEEESHRRIKKLKTEKVDPLF 
TVPAPPPHV 


5766 


1608 


663 


SGLFSVDPASSQAMELSDVTLIEGVGNEVMVVAGVVVLI3LALVL 
AWLS T YVADS GSNQLLGAI VSAGDTS VLHLGHVDHLVAGQGNPE 
PTE L PHPS EGRDEKAE EAGEGRGDSTGEAGAGGG VEPS LEHLLD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLM VP VFWL LG WWY FR I NYRQ FFTAP ATVS L VG VTV F FS FLV 
FGMYGR 


5767 


2 


892 


NFRATPRPPTRPELRTGTEVILWYLDWRALMKRKRMKANIKLVG 
SGFPLPSSDLDDSLTEEIDEKIGFRNDANFDWQNVADFRDAGGS 
LTEVKVEEEERDPQSPBFEIEEEEEMLSSVIPDSRRENELPDFP 
HI DE FFTLNSTPSRSA YDEPHLL VNI E KQKLELEKRR LDI BAER 
LQ VE KERLQ I EKERLRHLDM EHE RLQLE KE RLQ IE RE KLRLQ I V 
NSEKPSLENELGQGEKSMLQPQDIETEKLKLERERLQLEKDRLQ 
r Jj l\.r aolSxujy JL c*Kc»KLiQVBKDR1jR IQKEGHLQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA 
AAMVAKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 
SRVTSLANLIPPVKATPLKRFSQTLQRSISFRSESRPDILAPRP 


5769 


38 


667 


TKTKKGVKEKATDQSVKAFAEHCPELQYVGFMGCSVTSKGVIHL 
TK1jRNLSSLDLRHITELDNETAMEIVK3?CKNLISIiNLCLiNWIIN 
DRCVEVIAKEGQNLKELYLVSCKITDYALIAIGRYSMTIETVDV 
GW CKE I TDQG ATLI AQS S KSLR YLGLMR CDKVNE VT VEQL VQQ Y 
rn 1 1 r o 1 V liy DLKK 1 LiERAYQMGWT PNMSAAS S 


5770 


1 


484 


DSRRYDVKTRKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTLA 
r ASyjjKKi s IjS LTPDVPEADLS EVD PKLVSNLMP FQRAGVNFAI 
AKGGRLLLADDMGLGKT IQAI C I AAFYRKE WPLLVWPS S VRFT 
WEQAFLRWLPSLSPDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGS DHS S LGLEQLQD YM VTLRS KLGPLE I QQ FAMLLRE 
YRLGLPIQDYCTGLLKLYGDRRKFLLLGMRPFIPDQDIGYFEGF 
LEGVGI REGG I LTDSFGRI KRSMS STSASAVRS YDGAAQRPEAQ 
AFHRLLAD I THD I E 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
ILTFSNL»VTCS AI YHLPVFPERE PGCSMRDLRVA 


5773 


2 


723 


PR VRS KHNF C FMEMNTRLQ VEHP VTEM I TGTDLVE WQLR I AAGE 
KIPLSQEEITLQGHAFEARIYAEDPSNNFMPVAGPLVHLSTPRA 
DPSTRIETGVRQGDEVSVHYDPMIAKLVVWAADRQAALTKLRYS 
LRQ YNI VGLHTN IDFLLNLSGHPEFEAGNVHTDF I PQHHKQLLL 
S RKAAAKESLCQAALGL IL KEKAMTDTFTLQAHDQFSPFSSS SG 
RRLNISYTRNMTLKDGKNSK 


5774 


0 


592 


FVEE ENI RWRCGGSELNFRRAVFSADS KYI FC VSGDFVKVYS T 
VTEE CVHI LHGHRNL VTG I QLNPNNHLQL YS CS LDGTI KLWD Y I 
DGIL I KTFI VGCKLHALFTLAQAEDS VFVIVNKE KPDI FQLVS V 
KLPKSSSQEVEAKELS FVLDY INQS PKCI AFGNEGVYVAAVREF 
iJjo V i b r K.Kiii I TSRVTLSSS 


5775 


3 


538 


SSGCCDPAAPSSLAEAATMPVSKCPKKSESIjWKGWDRKAQRNGL 
RSQ VYAVNGDYYVGEWKDNVKHGKGTQ VWKKKGA I YEGDWKFGK 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLiRLSQNP 
RP 


5776 


2 


484 


RLPQDCVCQNLSESLGTLCPSKGLLFVPPDIDRRTVELRLGGNF
IIHISRQDFANMTGLVDLTLSRNTISHIQPFSFLDLESLRSLHL
DSNRLPSLGEDTLRGLVNLQHLIVNNNQLGGIADEAFEDFLLTL
EDLDLSYNNLHGPAVGLRGDAWVQPSTS


5777 


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ 
VKKLEQALKDGS AGLD PQLPGTCYS PHCPPDKAEAGS TLPENLG 
GGSGSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 
YRGSEGS PTKPF INPL PKPRR TFKHAGEGDKDGKPG I GFRKEKR 
NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLLQSS SES S RVDW YAQTKLG LTRTLS EENVYEDI LDPPMKENP 
YED I ELHGRC LGKKCVLNFPAS PTS S I PDTLTKQS LS KP AFFRQ 
NSERRNV 


5778 


1 


1210 


QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAGVAEEPGS 
GGP CWLQLE E VPGPGPLGGGG PLRS PSS YSS DE LS PGE PLTS P P 
WAPl^APERPEHLLNRVLERLAGGATRDSAASDILLDDIVLTHS 
LFLPTEKFLQELHQYFVRJ^GMEGPEGI^RKQACI^LJ^FLDT 
YQGLLQEEEGAGHI I KDLYLL I MKDESLYQGLREDTLRLHQLVE 
TVELKI PE ENQP PS KQVKPLFRH FRR IDS CLQTRVAFRGS D E I F 
CRVYMPDHS YVTIRSRLSASVQD I LGSVTEKLQ YSEEPAGREDS 
LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFACTRDSYEALV 
PLPEEIQVSPGDTEIHRVEPEDVANHLTAFHWELFRCVHELEFV 
DYVFHGE 


5779


138 


1671 


EAVQVLIKHSADVNARDKNWQTPLHVAAANKAVKCAEVIIPLLS 
S VNVS DRGGRTALHHAAIiNGHVEMVMTjT .t, Attn am Tim a irn vvdx} v 

ALHWAAYMGHLDVVALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NWKHLLNLGVEIDEINVYGNTALHIACYNGQDAWNELIDYGA 
NVNQPNNNGFTPLHFAAASTHGALCLELLVNNGADVNI QS KDGK 
S PLHMTAVHGRFTRS OTL I ONGGE I DCVDKDGKTTPTiFVA AP YHH 
ELLINTLITSGADTAKCGIHSMFPI^IAALNAHSDCCRKIiLSSG 
QKYSIVSLFSNEHVLSAGFEIDTPDKFGRTCLHAAAAGGNVECI 
KLLQSSGADFHKKDKCGRTPLHYAAANCHFHCI ETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCLEFLLQNDANPS IRDKEGYNS IHYAAAYGHRQCLELLLE 
RTNSGFEESDSGATKSPLHLAVSEMP 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEKSEPVS 

DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHQQACLREK 
KKQLNVI GASDQSPLQS PSNTjRDNP 


5781 


19 


941 


RGSLGGHPWRPPMRAASQGCLPVSFVTGPHQERAYGGRGPGGAF 
PAPPVSGTCPPDLIYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 
VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPLP 
VAPSRTCGGC*TWDPALLVSP/PQGDSTPELPAP\QQPTGGPSR 
CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 
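
The annotation characters defined in the table legend can be read mechanically. The following is a minimal Python sketch, not part of the original disclosure, that separates the one-letter residues of a segment from the `*` (stop codon), `/` (possible nucleotide deletion) and `\` (possible nucleotide insertion) markers. The function name `parse_annotated` and the choice to report each marker by the number of residues that precede it are assumptions made for illustration only.

```python
# Hypothetical helper: the marker meanings come from the table legend,
# but the function name and the output layout are illustrative assumptions.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")  # one-letter codes listed in the legend


def parse_annotated(segment):
    """Split an annotated segment into residues and marker positions.

    Each marker position is the count of residues seen before the marker.
    """
    residues = []
    stops, deletions, insertions = [], [], []
    for ch in segment:
        if ch.isspace():
            continue  # layout whitespace carries no sequence information
        if ch in RESIDUES:
            residues.append(ch)
        elif ch == "*":
            stops.append(len(residues))
        elif ch == "/":
            deletions.append(len(residues))
        elif ch == "\\":
            insertions.append(len(residues))
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return {
        "residues": "".join(residues),
        "stops": stops,
        "deletions": deletions,
        "insertions": insertions,
    }


if __name__ == "__main__":
    # One line of the segment listed for SEQ ID NO: 5781 above.
    line = "QAPCRAGPTRKVAVAPRPPSCP*GPE\\PGEEPRRPLDRSPPLGQ"
    print(parse_annotated(line))
```

Characters outside the legend raise an error rather than being dropped, so extraction damage in a line is surfaced instead of silently producing a shorter sequence.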


5782 


5176 


1237 


DRSMMS MAADS YTDS YTDT YTEA YM VP PLP PEEP PTMP P LP P E E 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPES S I TLTPVESAWAE EHEWPERP VTCMVSETPAMS AEPT 
VLASEPPVMSETAETFDSMRASGHVASEVSTSLLVPAVTTPVLA 
ESILEPPAMAAPESSAMAVLESSAVTVLESSTVTVLESSTVTVL 
EP9WTVPEPPWAEPDYVTIPVPWSAIiEPSVPVLEPAVSVLQ 
PSMIVSEPSVSVQESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 
ILESSIMSSHVMKGIN^SGDQNLAPEIGMQEIALHSGEEPHAE 
EHLKGDFYESEHGINIDLNINNHLIAKEMEEINTVCAAGTSPVGE 
IGEEKILPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 
ATG\TSKGI EFTTASTLSLVNKYDVDLSLTTQDTEHDMLISTS P 
SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTNEPLPVKRD\DQ 
T1AALI\SLKESSGGEKEVPPPS*REHLPDSGFSANIEDINEAD 
LVRPVSSPRTWNVLPSPRAGL\EGP\LLASDFGPVQNLYSSPW 
\SSMP\ERASGS\SSGEKGG\YEIFVKVKDTHEKSKKNKNRDKG 
EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 

HRS \QTRS r s RS /rdrrrrs s rs rs ksrgrrs vs kekrkrs p kh 

RSKSRERKRKRSSSRDNRFCTVRARSRTPSRRSRSHTPSRRRRSR 
SVGRRRSFSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 

DT'PQRPflRTP/?RRPPQ13Q\n7PPPQPQTQD17TJT DDCDTOT ODDCO 
*rOr\.rtoi\. i trot\.x\nt\Oi^o V VnKKor o±£>r VivJjKKoxli fJjKKKf a 

RS P I RRKRS RS S ERG RS P KRLTDLD KAQ LLE I AKANAAAMCAKA 
GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 
DDDVIVNKPHVSDEEEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
KDDDNVFSSNLPSEPVDISTAMSERALAQKRLSENAFDLEAMSM 
LNRAQERI DAWAQLNS I PGQFTGSTGVQVLTQEQLANTGAQAWI 
KKDQFLRAAPVTGGMGAVLMRKMGWREGEGLGKNKEGNKEPILV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQP PE FLLVHDSGPDHRKHFLFRVL I NGSAYQPNCM FFLNR 
Y 


5783 


1693 


698 


DSGLRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
QKLGFGTGVNVYLMKRS PRGLSHS P WAVKKINP I CNDHYRS VYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCIiAMEYGGEKSL 
NDLIEE/PI*SQ/PKILFQQP/LILKVALNMARGLKYLHQEKKL 
LHGD I KS SNWI KGDFET I KI CDVGVSLPLDENMTVTDPEAC Y I 
GTEPWKPKEAVEENGVITDKAD I FAFGLTLWEMMTLSIPHINLS 
NDDDDEDKTFDESDFDDEAYYAALGTRPPINMEBLDESYQKVIE 
LFS VCTNEDP KDRPSAAHIVEALETDV 


5784 


2669 


1388 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH
GILSNTHRQAARVNLSFDFPFYGHFLREITVATGGFIYTGEWH
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVKV
GLSDAFVWHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW
VDSGCPEESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC


5785 


2669 


1388 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH
GILSNTHRQAARVNLSFDFPFYGHFLREITVATGGFIYTGEWH
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVKV
GLSDAFVWHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESKEKMCENTRPVFT\FLEPPQP*ERQPPSSGS*LPP
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC


5786


2555


1674 


S YKLPAAB RRAS S CS QP PT PTRRRW PAPGRTS RGHRPQM * SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 
S*H+KRNLSQRSSSMSRRPLSCARPHR**RQGLTVAARLPTWAK 
SPPLACSFCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP 
SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP*WRPSGRLSTV*RA 
TGGSTATAPPKRFPRNWNPMMAE 


5787


2 


1460 


MAS AAS VTS LADE VN C P \ I CQGTL KE AGS LSNCG / HKN FCRACL 
T\RYCEIP\GPDVLEESP\ TrP\TiPK'FPTrRp\rtcppT>Mwn , r a\r\/ 
VEN I E R LQLVSTLGLGEE D VCQEHG E KI YFFCEDDEMQLC WCR 
EAGEHATHTMRFLEDAA\APYREQIHKCLKCLI KEREE IQE IQS 
RENKRMQVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLAQLES 
QDGD I LRQRDE FDLLVAGE I CRFSAL I EELEEKNERPARELLTD 
IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKMF 
LEKLCFELDYEPAHISLDPQTSHPKLLLSEDHQRAQFSYKWQNS 
PDNPQRFDRATCVLAHTGITGGRHTWVVSIDLAHGGSCTVGVVS 
EDVQRKGELRLRPEEGVWAVRLAWGFVSALGSFP\TRLTLKEQP 
RQVRVS LD YE VGWVTFTNAVTREP I YTFTAS FTRKVI PFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHSVSGRSSAYGDATAEGHPAGPGSVSSSTGAISTTTGHQEGDG

SEGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 

AI P YMQVI LMLTTDLDGEDEKDKGALDNLLSQL I AELGMDKKDV 

SKKNERSALNEVHLWMRLLSVFMSRTKSGSKSS ICESSSLI SS 

ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 

HTTSS PPDMSP FFLRQYVKGHAADVFEAYTQLLTEMVLRLPYQI 

KKITDTNSRI PP PVFDHSW FYFLSE YLM IQQTP F VRRQVRKLLL 

F I CGS KE K YRQLRDLHTLDS \ H VRG I KKLLE EQG I FLRAS WTA 

SPQSALQYDTLISLMEHLKACAEIAAQRTINWQKFCIKDDSVLY 

FLLQVSFLVDEGVS PVLLQLLS CALCGS KVLRALAAS SGSSSAS 

SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQEDQ 

LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 

IYRNSSKSQQELLLDLMWSIWPELPAYGRKAAQFVDLLGYFSLK 

TPQTEKKLKEYSQKAVEILRTQNHILTNHPNSNIYNTLSGLVEF 

DGYYljESDPCLVCNNPEVPFCYIKLSSIKVDTRYTTTQQVVKLI 

GS HT I S KVTVKIGDLKRTKMVRT INL Y YNNRTVQAI VEL KNKPA 

RWHKAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 

TETLQCPRCSASVPANPGVCGNCGENVYQCHKCRSINYDEKDPF 
LCNACGFCKYARFDFMLYAKPCCAVDPI ENEEDRKKAVSNINTL 
LDKADR VYHQLMGHR PQ LENLLC KVNEAAPE KPQ DDS GTAGG I S 
S TS ASVNR Y I LQLAQE YCGDCKNS FDELSKI IQKVFASRKELLE 
YDLCX2RBAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 
CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNLRRGA 
AAMREEVRQLMCLLTRDNPEATQQMNDLI IGKVSTALKGHWANP 
DLASSLQYEMLLLTDSISKEDSCWELRLRCALSLFLMAVNIKTP 
WVENITLMCLRILQKL I KP PAPTS KKNXDVPVEALTT VKPYCN 
E I HAQAQLWLKRD P KAS YDAWKKCLP I RG I DGNGKAP S KS ELRH 
LYLTEKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVLFTPAT 
QAARQAACT I VEALAT I PSRKQQVLDLLTS YLDELS I AGE CAAE 
YLALYQKLITSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 
TLS TDLQQG YALKS LTG LLSS FVEVES I KRHFKSRLVGTVLNGY 
LCLRKLWQRTKLIDETQDMLLEMLEDMTTGTESETKAFMAVCI 
ETAKRYNLDDYRTPVFI FERLCS I I YPEENEVTEFFVTLEKDPQ 
Q E D FLQGRM P GNPYS SNE PG I G PLMRD I KNKI CQDCDLVALLED 
DSGMELLVNNKIISLDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 
LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 
RLAGIRDFKQGRHLLTVLLKLFSYCVKVKVNRQQLVKLEMNTLN 
VMLGTLNLALVAEQE SKDSGGAAVAEQVLS I ME I \ IQAE PNVEP 
LSEDKGNLLLTGDKDQLVMLliDQINSTFVRSNPSVLQGLLRIIP 
YLSFGEVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 
IAAGIK\NNSNGHQL\KDL\ILQKGITQNALD\YMKKHIP/SAA 
R I WDAD I \ WKS F CLR PALP F I LRLLRGLA I QH PGTQVL I GTDS I 
PNLHKLEQVS\SDEGIGTLA\ENL\LESLREHPDVNKKIDA\AR 
RETRAE KKRMAMAMRQKALG TLG \ MTTNEKGQ WD / TOTALLE A 
DWEELIEEP\GLTCCICREGYKFQPTKVLGIYTFTKRWLGGVW 
ENKPRETSRATS TVSH FNI VH YDC \HLA\ AVS LARGREE WE S AA 
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 
TYQLNIHDIKLLFLRFAMEQS FSADTGGGGRESNIHLI PYI IHT 
GL YVLNTTRATS RE EKNLQG FLEQ PKEKWVE S AFE VDGP YY FTV 
LALH I L P PEQ WRATRVE I LRRLLVTS QARAVAPGGATRLTDKAV 
KD YS AYRSSLLF WALVDL I YNM F KKVPTSNTEGGW S CS LAE Y I R 

HNDMPIYEAADKALKTFQEEFMPVETFSEFLDVAGLLSEITDPE 
SFLKDLLNSVP 


5789 


1 


2407 


LPLHAVEKTGRPGQPALKMPGKLRSDAGLESDTAMKKGETLRKQ 
TEE KE KKE KP KS DKTEE I AEEE ETVF PKAKQVKKKAE PS EVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFP IQAKTFHHVYSGKDLI AQARTGTGKTFS FAI PL 
IEKLHG\ELQDRKRGRAPQVLVLAPTRELANQVSKDFSDITKKL 
S VACFYGGTP YGGQFERMRNG I D I L VGT PGRI KDH I QNGKLDLT 
KLNHWLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPHWVFNYAKKYMKS TYEQVDLIGKKTQKTAI TVEHLAI KCH 
WTQRAAVI GD VI R V YS GHQGRT 1 1 FCETKKEAQE LS QNS AI KQD 
AQS LHGD I PQKQRE ITLKGFRNGS FGVLVATNVAARGLD I PE VD 
L VIQS S P P KD VE S Y I HRSGRTG RAGRTG VC I CFYQHKEE YQLVQ 
VEQKAGIKFKRIGVPSATEI I KASSKDAI RLLDSVPPTAISHFK 
QSAEKLIEEKGAVEALAAALAHISGATSVDQRSLINSNVGFVTM 
ILQCSIEMPNISYAWKELKEQLGEEIDSKVKGMVFLKGKLGVCF 
DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRS FS KAFGQ 


5790 


3786 


1585 


ARRQRDP LQALRRRNQELKQQ VDSLLSE S Q L KEALE PNKRQHI Y 
QRCIQLKQAIDENKNALQKLSKADESAPVANYNQRKEEEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEEEKEENESHKWSTGEEYIAVGDFTAQQVGDLTFKKGEI 
LL V I EKKPDGWW IAKDAKGNEGLVPRTYLE P YS E EEEGQE S S EE 
GSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAI SEAG I FCLVN 
HVS FCYL I VLMRNRMETVEDTNGS E TG FRAWNVQSRGR I FLVS K 
PVLQQINTVDVLTTMGAIPAGFRPSTLSQLLEEGNQFRANYFLQ 
PELMPSQLAFRDLMWDATEGTIRSRPSRISLILTLWSCKMIPLP 
GMS IQVLSRHVRLCLFDGNKVLSNIHTVRATWQP KKPKTWTFS P 
QVTRILPCLLDGDCFIRSNSASPDLGILFELGISYIRNSTGERG 
ELS CGWVFLKLFDASGVPI PAKTYELFLNGGTP YEKG I EVDP S I 
S RRAHGS VF YQ IMTMRRQPQLL VKLRS LNRRS RNVLS LLPETLI 
GNMCSIHLLI FYRQI LGD VLLKDRMSLQ S TD L ISH PMLATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKEFLKVPRFLLVYH 
\G CVLPLL/HTPTRLP P FRWAE EETETARWKVITD FLKQNQENQ 
GALQALLS PDGVHE P FDLSEQTYDFLGEMRKNAV 


5791 


3 


163£ 


LRVAE FAGTS R/ IGAGLIQPLHRAPARDHGLLRGGAAPALS VSH 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
nrsiqogfcfnilcvgetgigkstlidtlfntnfedyesshfrp 
NVKLKAQTYELQESNVQLKLTIVNTVGFGDQINKEESYQPIVDY 
I DAQFEAYLQEELKI KRSLFT YHDS RIHVCLYFI S PTGHS LKTL 
DLLTMKNLDS KVYI I P VI AKADTVS KTELQKFKI KLMS ELVSNG 
VQ I YQFPT DDDTI AKVNAAMNGQL P FAWGS MDEVKVGNKMVKA 
RQY P WGWQ VENENHCD F VKLREML I CTNMEDLREQTHTRH YEL 
YRRCKLEEMG FTD VG PENKP VS VQET YEAKRHEFHGERQRKEEE 
MKQM FVQR VKE KEAI L KEAERE LQAKFEHLKRLHQEERMKLEE K 
RRLLEEE 1 1 AFS KKKATS E I FHSQS FLATGSNLRKDKDRKNSQF 
FVKQKVPEHRRS S SQANF I KKKLEVCFDFAVI CFITS I FGEQPQ 
LLIFMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAPSPAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY 
LARRPKLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKQVHLVSPLTT
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL
NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY
LARRPKLQLNRHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKQVHLVSPLTT
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL
NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE
EKATRAPHTD


5794 


1 


5016 


MGPRLSVWLLLLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV
KGQKGERGLPGLQGVIGFPGMQGPEGPQGPPGQKGDTGEPGLPG
TKGTRGPPGASGYPGNPGLPGIPGQDGPPGPPGIPGCNGTKGER
GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG
FPGIPGTPGPPGLPGLQGPVGPPGFTGPPGPPGPPGPPGEKGQM
GLSFQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG
YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD
RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP
GDQGPPGIPGQPGFIGEIGEKGQKGESCLICDIDGYRGPPGPQG
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK
GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 
PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 
DKGQAGFPGGPGSPGLPGPKGEPGKIVPIiPGPPGAEGLPGSPGF 
PG PQGDRG FPGT PGR \ PGL\PGEKGAVG\QPG I GFPGP PGP KGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGL 
PG L PG I PGTPGEKGS I G VPGVPGEHGAIG PPGLQG I RGE PG P PG 
LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 
SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 
PGLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 
AGP PG I G I PG LRGE KGDQG I AGF PGS PGE KGE KGS IG I PGMPG S 
PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 
GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 
GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGF 
QGPKGLPGLQGIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEP 
GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGIjPGSMGPPGTPSVDHGFL 
VTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAG 
S C LRKFS TM P FLFCNI NNVCNFASRND YS YWLS TPE P MPMS MAP 
ITGENIRPFISRCAVCEAPAM;mAVHSQTIQIPPCPSGWSSLWI 
GYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRGTCN 
YYANAYSFWLATIERSEMFKXPTPSTLKAGELRTHVSRCQVCMR 
RT 


5795 


1192 


61 


STRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHE 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDT I FEDYEKLLQS ENYVTKRQSLKLLGELI LDRHN 
FA I MTKY I S KPENLKLMMNLLRDKS PNI QFEAFHVFKVFVAS PH 
KTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQI 
RDLKKTAP + RALRDSKR 


5796 


2 


1078 


GRVGWELWCMYISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKF 
FGEIGLLDPGMDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIE 
RKKKPYNSNIGFYTKRNALRVAEVWMDDYKSHVYIAWNLPLENP 
G I D I GDVSE RRALRKS LKCKNFQWYLDHVYPEMRR YNNTVAYGE 
LRNNKAKDVCLDQGPLENHTAI LYPCHGWG PQLARYTKEGFLHL 
GALGTTTLL P DTRCLVDNS KS RLPQLLD CDKVKS S L YKRWNFI Q 
NGAIMNKGTGRCLEVENRGLAG IDL I LRS CTGQRWTI KNS I K* R 
EGAGALE PGPQDMAAPPNI WTSCPGGETARGRQVLDGP PRASPG 
QHRDPG 


5797


2 


891 


PRVRQKTLVDVTLENSNIKDQIRNLQQTYEASMDKLREKQRQLE 
VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 
KHS AE KEALLEETNS FLKAI EEANKKMQAAE I S LEEKDQR I GEL 
DRLIERMEKERHQLQLQLLEHBTEMSGELTDSDKERYQQLEEAS 
AS LRERI RHLNDMVHCQQKKVKQMVE E I E S LKKKLQQKQ LL I LQ 
LLEKI S FLEGENNELQSRLDYLTETQAKTE VETREIGVGCDLLP 
SQTGRTREIVMPSRNYTPYTRVLELTMKKTLT 


5798 


644 


115 


KILGSRWKSMSNQEKQPYYEEQARLSKIHLEKYPNYKYKPRPKR 
TCIVDGKKLRIGEYKQLMRSRRQEMRQFFTVGQQPQIPITTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGSLAGNEMINGEDEMEMYDDYEDDPKSDYSSENEAPEAVSAN 


5799 


2679


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN
KTSVQFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 
LLDNVDPNPENFVGAGIIQTKALQVGCLLRLEPNAQAQMYRLTL
RTSKEPVSRHLCELLAQQF 


5800 


2679 




1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL 
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN
KTSVQFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA
LLDNVDPNPENFVGAGIIQTKALQVGCLLRLEPNAQAQMYRLTL
RTSKEPVSRHLCELLAQQF 


5801


j 


X4XJ 


FPRLYHblPDGblTSIKINRVDPSESLSIRLVGGSETPLVHIII 
QHIYRDGVIARDGRLLPGDIILKVNGMDISNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDSFHVILNKSSPEE 
ylAjl l\LivKl\.VL/liFOVr 1 rnVJjUGwVAYKnGQLEENDRVIiAINGH 
DLRYGS PES AAHL I QAS ERRVHLWS RQVRQRS PDI FQEAGWNS 
NGSWSPGPGERStrrPKPLHPTITCHEKVVNIQKDPGBSLGMTVA 
GGASHREWDLPIYVISVEPGGVISRDGRIKTGDILLNVDGVELT 
EVSRSEAVALLKRTSSSIVLKALEVKEYEPQEDCSSPAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
G YE E YNGNKP FFI KS I VEGT PA YNDGR I RCGD TLLAVNGRS TS G 
M I HACLARLL KELKGR I TLT I VS W PGTFL 








obxQiMbKlMDIjPTLIiRnAFREMFSVGGLF WMFRIRIILCLM 
GAFFYLISPLDFVPEALFGILGFLDDFFVIFLLLIYISIMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAEIYAYREEQDFGIEIVKVKAIGRQRFKVLELRTQSD 
G IQQ AKVQ I L P E CVL P S TMS AVQLE S LNKCQ I F P S KP VS REDQ C 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQR 
LRCEIiDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKKDMS PQKFWGLTRS ALLPTI PDTEDE I S PDKVI 
LCL


5804 


2 


1707 


EME KQRQEEQRKRTEEERKRR I EQDMLEKRKI QRELAKRAEQI E 
DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 
K JS b K.b H x is. x hi Xj u KR I R Y bEQRPS JUKEAKCLS LVMDDE I E S E AKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDE ENQDTAK I FKGYRPGKLKLSFEEMERQRREDEKR 
KAEEEARRRI EEEKKAFAEARRNM WDDDSPEMYKT ISQE FLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQEFEQLRQEM 
GEEEEENETFGLSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 

PTf P TC\YV TPPFT? &DD O t. TTM.PT VPDPZiPTOPU1?T?TVmrrKn3 Tin T3 vo 

EAP FTHKVNMKARFEQMAKAREEEEQRRI EEQKLLRMQFEQRE I 
DAALQKKREEEEEEEGSIMNGSTAEDEEQTRSGAPWFKKPLKNT 

SWD P P VR FTVK VTO P P VP P T T WW FFOP T T .OTV3P n VTiV T uup c 

TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


YISDTLGQVYKSKIRWWIEENGGNGNISVDDLIALLDLAEHASS 
AFKESQQQS EDRE YE VKERLYPKS KRRYDTYNIAGYQGE IEVGL 
YTI QI LQLI PFFDNKNELS KRYMVNFVSGS S DI PGDPNNE YKLA 
LKNYI P YLTKLKFSLKKS FD FFDE Y FVLLKPRNN I KQNEEAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGLGSKFSEPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLLKP IHVFFGAAI LSLS IASVISGINEKLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRP 


5807 


2267 


1302 


RFSKKTFRRPMAVDIQPACLGLYCGKTLLFKNGSTEIYGECGVC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFFIEW 
YSG KKS S SAL FQH I TAL FE CS MAAI I TL L VS DP VG VLY IRS CRV 
I^LSDWYTMLYNPSPDYVTTVHCTHEAVYPLYTIVFIYYAFCLV 
LMMLLRP LLVKKI ACGLGKS DRFKS I YAAL Y F F P I LTVLQAVGG 
GLL YYA F P Y 1 1 LVLS L VTLAVYMSAS E I ENC YDLL VRKKR L I VL 
FSHWLLHAYGI IS I SRVDKLEQDLPLLALVPTPALFYLFTAKFT 
EPSRILSEGANGH 


5808 


2 


433 


S LPDSGWE YLSNGGVADNHKDFGELR YNE CLMNFS CNGKNGS S 
EGRITHGFQLKSAYENNLMPYTNYTFDFKGVIDYIFYSKTHMNV 
LGVLGPLDPQWLVENNITGCPHPHIPSDHFSLLTQLELHPPLLP 
LVNGVHLPNRR 


5809 


464 


2422 


ILVPGFQGILHPGVYCALQSQHQAQELVADIDECEVSGLCRHGG
RCVNTHGSFECYCMDGYLPRNGPEPFHPTTDATSCTEIDCGTPP
EVPDGYIIGNYTSSLGSQVRYACREGFFSVPEDTVSSCTGLGTW
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQEGFE
INSRRINPKISYVISIKGQRLDPMESVREETVNLTTDSRTPEVC
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI
SIFNETCLKLNRRSRKVGSEHMYQFTVLGQRWYLANFSHATSFN
FTTREQVPWCLDLYPTTDYTVNVTLLRSPKRHSVQITIATPPA
VKQTISNISGFNETCLRWRSIKTADMEEMYLFHIWGQRWYQKEF
AQEMTFNISSSSRDPEVCLDLRPGTNYNVSLRALSSBLPWISL
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV
LPLALQSTFSCDSEGASSFFSNASDADGYVAAELLAKDVPDDAM
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRrhscaV
WAQVKDSSLMLLQMAGVGLGSLAWIILTFLSFSAV


5810 


3 


1641 


KVFGTHKDHEVSTLDTAISAVKVQLAEFLENLQEKSLRIEAFVS 
EIESFFNTIEENCSKNEKJlIiEEQNEEMMKKVIiAQYDEKAQSFEE 

VTfT?T(TfMPPT»WPnM\7WPT.nQMnTaKTYrT.PTTVPP&PPT m?»T7lTT t 

SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQPPRLEPQEPNSATSTTIAVYWSMNKEDVIDSFQVYCME
EPQDDQEVNELVEEYRLTVKESYCIFEDLEPDRCYQVWVMAVNF
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA
TETYTLEYCRQHSPEGEGLRSFSGIKGLQLKVNLQPNDNYFFYV
RAINAFGTSEQSEAALISTRGTRFLLLRETAHPALHISSSGTVI
SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS
SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 
VGI.LLDYNNORLT FTNAESEOLIjFT TPHPPNPnVWPAPaT .T?TfDf2 
KCTLHLGIEPPDSVRHK 


5811 


1918 


851 


AAALADPLPEDKWSAEKRRPLKSSLGYEITFSLLNPDPKSHDVY 
WDIEGAVRRYVQPFLNALGAAGNFSVDSQILYYAMLGVNPRFDS
ASSSYYLDMHSLPHVINPVESRLGSSAASLYPVLNFLLYVPELA
HSPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDSKTYNASVLPV
RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT
WELDRLLWARSVENLATATTTLTSLAQLLGKISNIVIKDDVASE
VYKAVAAVQK^AEEIiASGHIASAFVASQEAVTSSELAFFDPSLL 
HLLYFPDDQKFAIYIPLFLPMAVPILLSLVKIFLETRKSWRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREEEVEPGTARPPPAASAMDASLEKIADPT 
LAEMGKNLKEAViOMLEDSQRRTEEEN'GKKLISGDIPGPLQGSGQ 
DMVSILQLVQNLMHGDEDEEPQSPRIQNIGEQGHMALLGHSLGA
YISTLDKEKLRKLTTRILSDTTLWLCRIFRYENGCAYFHEEERE
GLAKICRIAIHSRYEDFWDGFNVLYNKKPVIYLSAAARPGLGQ
YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG
RLPLLLVANAGTAAVGHTDKIGRLKELCEQYGIWLHVEGVNLAT
IJUX3YVSSSVLAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 
LTLVAGLTSNKPTDKLPJUjPLWLSLQYLGLDGFVERIKHACQLS 
QPXQESLKKVNYIKILVEDELSSPVVVFRFFQELPGSDPVFKAV 
PVPNMTPSGVGRERHSCDALNRWIiGEQLKQLVPASGLTVMDLEA 
EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVLCCTLQLR
EEFKQEVEATAGLLYVDDPNWSGIGVVRYEHANDDKSSLKSYPQ 
GENIHAGLLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVETIAATAREIEDNSRLLENMTEWRKGIQEAQVELQKAS 
EERLLEEGVLRQIPWGSVLNWFSPVQALQKGRTFNLTAGSLES
TEPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQKPFKRSLRGSDA 
LSETSSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813


2936 




HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
LLRLLSFC^IiAGLCRGNSVERKlYIPLNKTAPCVRiLNATHQI 
GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMVLLESKHFT 
RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 
SNS YGPEFAHCREIQWNSLGNGLAYEDFSFP I FliLEDENETKVI 
KQCYQDHNLSQNGSAPTFPLCAMQLFSHMAWLSFSTAT\CMRRS 
S IQSTFS INP KI VCDPLSDYNVWSMLKP INTTGTLKPDDRVWA 
ATRLDSRSFFWNV\APGAESAVAS fvtqlaaaealqkapdvttl 
P RNVM FVFFQGETFD Y I G S SRM VYDMEKGKFP VQIiENVDS FVEL 
GQVALRTSLELWMHTDPVSQKNESA^NQVEDLLATLEKSGAGVP 
AVILRRPNQSQPLPPSSLQRFLRARNISGVVLADHSGAFHNKYY 
QSIYDTAENINVSYPEWIiEPLKE/ETWNFG*QDTAKALADVATV 
LGRALYELAGGTNFSDTVQADPQTVTRLLYG\ FLIKANNSWFQS 
ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VLQYALANL 
TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRSTARIiARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 
IAS KELEL I TLTVGFG I L I FSL I VT YCINAKADVLF I APRE PGA 
VSY 


5814 


8500 


432 


ALKCRPRRVLAILVGPVQPDRMAEEGAVAVCVRVRPLNSREESL 
GETAQVYWKTHNNVIYPVDGSKSFNFDRVLHGNETPKNVYEA\I 
AAP 1 1 DS AIQG YNGTI FA\ YGQT \ ASGKTYTMMGS EDHLG VI PC; 
GQFHGHFSQKI * E VFLDRE FLLRVS YME I YNET I TDLLCGTQKM 
KPLI IREDVNRNVYVADLTEEWYTSEMALKWITKGEKSRHYGE * 
TKMNQRSSRSHTIFRMILESREKGEPSNCEGSVKVSHLNLVDLA
GSERAAQTGAAGVRLKEGCNINRSLFILGQVIKKLSDGQVGGFI
NYRDSKLTRILQNSLGGNPKTRIICTITPVSFDETLTALQFAST
AKYMKNTPYVNEVSTDEALLKRYRKEIMDLKKQLEEVSLETRAQ
AMEKDQLAQLLEEKDLLQKVQNEKIENLTRMLVTSSSLTLQQEL
KAKRKRRVTWCLGKINKMKNSNYADQFNIPTNITTKTHKLSINL
LREIDESVCSESDVFSNTLDTLSEIEWNPATKLLNQENIESELN 
SLRADYDNLVLDYEQLRTEKEEMELKLKEKNDLDEFEALERKTK 
KIX3EMQLIHEISNLKNLVKHREVYNQDLENELSSKVELLREKED 
Q I KKLQE YI DS QKLENI KMDLS YSLE S I EDPKQMKQTLFDAE TV 
ALDAKRESAFLRSENLELKEKMKELATTYKQMEND IQLYQSQLE 
AKKKMQVDLEKELQS AFNE ITKLTSL IDGKVPKDLLCNLELEGK 
ITDLQKELNKEVEENEALREEVILLSELKSLPSEVERLRKEIQD 
KSEELHIITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 
SNYTCSTDQEFQNFKTLiHMDFEQKYKMVLEENERMNQEIVNLSKE 
AQ KFDS S LGALKTE LS YKTQE LQE KTRE VQERLNEM EQLKE QLE 
NRDSPLQTVEREKTLI TEKLQQTLEEVKTLTQEKDDLKQLQES L 
QIERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 
KI SEEVSRNIiHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 
TADVKDNEI IEQQRKI FSLIQEKNELQQMLES VIAEKEQLKTDL 
KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 
TCDRLAEVEEKLKEKSQQLQEKQQQLLNVQEEMSEMQKKINEIE 
NLKNELKNKELTLBHMETERLELAQKLNENYEE VKS I TKERKVL 
KELQKS FETE RDHLRGY I RE I EATGLQTKEELKI AHI HLKEHQE 
TIDELRRSVSEKTAQIINTQDLEKSHTKLQEEIPVLHEEQELLP 
NVKKVSETQ E TMNELE LLTEQS TTKDSTTLAR I EMERLRLNE KF 
QESQEEIKSLTKERDNLKTIKEALEVKHDQLKEHIRETLAKIQE 
SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEIEMLG 
LSKRLQESHDEMKSVAKEKDDLQRLQEVLQSESDQLKENIKEIV 
AKHLETEEELKVAHCCLKSQEET INELRVNLSEKETE 1ST IQKQ 
LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 
KAKDSALQS I ESKMLELTNRLQESQEEIQIMIKEKEEMKRVQEA 
LQI ERDQLKENTKE I VAKMKESQEKE YQFLKMTAVNETQEKMCE 
I EHLKEQFETQKLNLENI ETENI RLTQI LHENLEEMRSVTKERD 
DlaRSVEETLKVERDQLKENLRETITRDLEKQEELKIVHMHLKEH 
QETIDKLRGIVSEKTNEISNMQKDLEHSNDALKAQDLKIQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKLSNMQKDLENSNAKLQEK 
IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQIKDQSLTLSK 
LEIENLNLAQKLHENLEEMKSVMKERDNLRRVEETLKLERDQLK 
ESLQETKARDLEIQQELKTARMLSKEHKETVDKLREKISEKTIQ 
I S DI QKDLD KS KDELQ KKIQELQ KKELQLLRVKED VNMSHKKIN 
EMEQLKKQ FE PNY LCKCEMDNFQLTKKLHE SLEE I R I VAKERDE 
LRRIKESLKMERDQFIATLREMIARDRQNHQVKPEKRLLSDGQQ 
HLMESLREKC3RIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVBKQKELLIK 
IQHLQQDCDVPSRELRDLKLNQNMDLHIEEILKDFSESEFPSIK 
TEFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 
NNFFNNR 1 1 AIMNESTE FEERS ATIS KE WEQDLKSLKEKNEKLF 
KNYQTLKTSIiASGAQVNPTTQDNKNPHVTSRATQLTTEKlRELE 
NS LHE AKE S AMHKE SKI I KMQKELEVTNDI I AKLQAKVHESNKC 
LE KT KET I Q VLQDKVALGAKP YKE E I EDLKM KLGK I DLE KMKNA 
KE FE KE I S ATKATVE YQKE V I R LLRENLRRS QQAQDTS V I SEHT 
DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 
EQL IKQKNELLSNNQHLSNEVKTWKERTLKREAHKQ VTCENS PK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 
LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 
DVP\ECKTQ 


5815 


23 


1460 


SELVMWTVQNRESLGLLSFPVMITMVCCAHSTNEPSNMSYVKET
VDRLLKGYDIRLRPDFGGPPVDVGMRIDVASIDMVSEVNMDYTL
TMYFQQSWKDKRLSYSGIPLNLTLDNRVADQLWPDTYFLNDKK
SFVHGVTVKNRMIRLHPDGTVLYGLRITTTAACMMDLRRYPLDE
QNCTLEIESYGYTTDDIEFYWNGGEGAVTGVNKIELPQFSIVDY
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL
SWVSFWINYDASAARVALGITTVLTMTTISTHLRETLPKIPYVK
AIDIYLMGCFVFVFLALLEYAFVNYIFFGKGPQKKGASKQDQSA 
NEKNKLEMNKVQVDAHGNILLSTLEIRNETSGSEVLTSVSDPKA 
TMYSYDSASIQYRKPLSSRE\A*GRAPDRHGVPSKGRIRRRAS\ 
QLKVKI PDLTDVNS IDKWSRM FFP I TFSLFNWYWLYYVH 


5816


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADEICKRLAPDSRL 
NPHRS LLGTGNYD VNVI MAALQGLGLAAVWWDRRR P LSQIiAL P Q 
VLGLI LN LPS P VS LGLLS LPLRRRHLRW PCARL / VT VS Y YNLDS 
K\LRAPEGPGGLRTE\ *GPFLAAALAQGLCEVLL WTKEVEEKG 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD
VMSNTTVPNAPQANSDSMVC3YVLGPFFLITLVGVWAWMYVQK
KKRVDRLRHHLLPMYSYDPAEELHEAEQELLSDMGDPKW\QAG
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLLPLLSP
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA
HPQALMGRGFPSGMAAAGRHLCFL


5818 


3 


3918 


QALRDKL WIFLVQS F YAVRHTES WKLMSTDDQQKIQAAAFDKCjD " 
DRRLGKKPIFSSSQQRKQVSDSGDIKIKSWRGNNKKECWSYLST 
NKKMKSDGLGASGHSSSTNRNSINKTLKQDDVKEKDGTKIASKI 
TKELKTGGKNVSGKPKTVTKS KTENGDKARLENMS PRQWERSA 
TAAAAATGQKNLLNGKGVRNQEGQI SGARPKVLTGNLNVQAKAK 
PLKKATGKDSPCLS IAGPSSRSTDSSMEFS ISTECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKNSVDS VKNSTVAI KSRP VSRVT 
NGTSNKKSIHEQDTNVNNSVLKKVSGKGCSEPVPQAILKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGES PNS VKSS VS SRQSDENV 
AKLDHNTTTEKQAPKRKMVKQVHTALPKWAKIVAMPKNLNQSK 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNNNKDSVSEQKPHKPLINLASEISDAEALQSSCRP\DPQK 
PLNDQEKEKLALE CQNIS KLDKS LKHELES KQ I CLDKSETKFPN 
HKE TDDCDAANI CCHS VG SDNVNS KFYS TTAL KYMVS NPNENS L 
NSNPVCDLDSTSAGQIHLISDRBNQVGRKDTNKQSS I KCVEDVS 
LCNPERTNGTLNSAQEDKKS KVP VEGLT IPS KLS DE S AMDE DKH 
ATADSDVSSKCFSGQLSEKNSPKNMETSESPESHETPETPFVGH 
WNLSTGVLHQRES PESDTGSATTSSDDI KPRSEDYDAGGSQDDD 
GSNDRGISKCGTMLCHDFIjGRSSSDTSTPEELKIYDSNLRIEVK 
MKKQSSNDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 
GSVQFAQEIDQVSSSADETEDERSEAENVAENFSISNPAPQQFQ 
GIINLAFEDATENECREFSANKKFKRSVLLSVDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGNSVCKNESTVLDLSSIDSSRKNKQSVSATEKKNTIDVL 
SSRSRQLLREDKKVNNGSNVENDIQQRSKFLDSDVKSQERPCHL 
DLHQRE PNS D I PKNS ST KS LDS FRS QVLPQEG PVKE S HS TTTE K 
AN I ALS AGD I DDCDTLAQTRM YDHR P S KTL S P I YEMD V I EAFEQ 
KVE S ETHVTDMDF * DDQH FAKQDWTLLKQLLS EQD SNLDVTNS V 
PEDLSLAQYLINQTLLliARDSSKPQGITHIDTLNRWSELTSPLD 
S SAS I TMAS FS S EDCS PQGEWT I LELETQH 


5819 


1


5557 


AAAGLLGALHLVMTLWAAARAEKEAFVQSES 1 1 EVLRFDDGGL 

LQTETTLG LS S YQQKS I S L YRGNCR P I RFE P PMLDFHEQ P VGM P 

KMEKVYLHNPSS E*TI TLVSI FATTSHFHAS FFQNRKILPGGNT 

S FDVS / VF LAR WGNVENTLF INTS NHGV FTY \ Q VFGVG VPNP Y 

RLR P F LiGARVTVNSS F S P 1 1 N I HNPHS E PLQ VVEM YS SGG DLHL 

ELPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 

TNAS DS TE FI I L P VEVE VTTAPG I YS S TEMLD FGTLRTQDLPKV 

LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKPITIjKAS\ESK 

YTKVAS I S FBAS KAKKPS QFSGKI TVKAKE KS YS KLE I P YQAEV 

LDG YLGFDKAATLFHI RDS PADPVERP I YLTNTFS FAI LIHDVL 

LPEEAKTMFKVHNFSKP VL I LPNESGY I FTLLFMPSTSSMHI DN 

N I LL I TNASKFHLP VRVYTG FLDYFVLPPKI EERFIDFGVLSAT 

EASNILFAIINSNPIELAIKSWHIIGDG\LSIELVAVDRGNRTT 

IISSLPECEKSSSSDQSSVTLASGYF\AVFRVKLTAKKL\EGIH 

DGAIQITTDYEILTIPVK\AVIAVGSLTCSPKHWLPPSFPGKI 

VHQSLNIMNSFSQKVKIQQIRSLSEDVRFYYKRLRGNKEDLEPG 

KKSKIANIYFDPGLQCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 

WDADWDLHQSLFKGWTG I KENSGHRLSAI FEVNTDLQKNI I SKI 

TAELS W P S I LS S PRHLKF PLTNTNCS S \ E BE I TLENP / S Q0VP V 

YVQ F I P LAL YSNPS VFVDKL VS RFNLS KVAK I DLRTLEFQ VFRN 

SAHPLQSSTGFMEG\LSPHLILNLILKPGEKKSVKVK\FTPVHN 

RTVSSL 1 1 VRNNLTVMDAVMVQGQGTTENLRVAGKLPGPGS SLR 

FKITEALLKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIE 

ISGYSCEGYGFKWNOQEFTLSANASRDIIILFTPDFTASRVIR 

ELKFITTSGSEFVFILNASLPY^MIATCAEALPRPNWELALYII 

ISGIMSALFLLVIGTA\YLEAQGIWEP\FRRRLS\FEASNPPFD 

VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGFCGAGGSSSRPSA 

GSHKQ * GPS GHPHS SHSNRNSADVDDVRAYNSGRTS SMTSAQAA 

S SQPANKTRPLVLDSNTGAQGHS AGRKS KGAKQSQHGSQHHAHS 

PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 

SSEDSDITSLIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGKG 

KPLQRKVKPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 

TSNPDTEPLLKEDTEKQKGKQAMPEKHESEMSQVKQKSKKLIjNI 

KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK . 

SRNAQKTKGTSKLVDNRPPALAKFLPNSQELGNTSSSEGEKDSP 

PPEWDS VPVHKPGSSTDSLYKLSLQTLNAD I FLKQRQTS PTPAS 

PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 

S L PGKNGNPTFAAVTAG YDKS PGGNGFAKVS SNKTG FSSSLGIS 

HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE 

VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 

SGSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 

TLAS IGLMGTENS PAP HAPSTS S P ADDLGQTYN PWR I WS PTIGR 

RSS DPWSNSHFPHEN 


5820 


310 


1270 


RVS LSGP VSLGVLLCARSSTMGKRDNRVAYMNP IAMARSRGP IQ 
SSGPTIQ\VI*IDQGLPGKK*KSN*KRKRK/DSKALAEFEEKMN 
ENWKKELEKHREKLLSGSESSSKKRQRKKKEKICKSW* \DSSSS \ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETES 
DSKDSLKKKKKSKDGTEKEKDIKGLSKKRKMYSEDKPLSSESLS 
ES E Y I EEVRAKKKKS SEEREKATEKTKKKKKHKKHS KKKKKKAA 
SSSPDSP*H*EKSGFPYKESAMSEEISTVKTTTYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821 


179 


915 


KWRNQ S WR W P KPGTN WMLS CS VCWRRVTWTGS VWMRKLG KHPQT 
PT/l KD CS I AATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 
TYVIKLFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 
SPLPPLPEDEEG\SEVTNSKSR*CVQACPPTHTPGGQPKNACR\ 
SRIPSPLAALRMQGTP*RWSPFEPEPSPSTLIYRNMQRWKRIRQ 
RWKEASHRNQLRYSESMKILREMYERQ 


5822 


464 


4379 


QTLKEMPIVMARDLEETASSSEDEEVISQEDHPCIMWTGGCRRI
PVLVFHADAI LTKDNNI RVIGERYHLS YKI VRTDSRLVRS I LTA 
HG FHE VHPSS TD YNLMWTG SHLKP FLLRTLSEAQKVNH F PRS YE 
LTRKDRL YKN 1 1 RMQHTHGFKAFH I LPQTFLLPAE YAE FCNS YS 
KDRG P W I VKP VAS SRGRG\ VYLINNPNQ I S LEEN I LVS R Y I NNP 
LL I DDFKFDVRL YVLVTS YDPLVI YLYEEGLARFATVRYDQGAK 
NI RNQ FMHLTN YS VNKKS GD Y VS CDD P E VED YGNKWSMS AMLR Y 
LKQEGRDTTALMAHVE33LI I KTI I SAEIiAIATACKTFVPHRSSC 
FEL YG FD VL I D S TL KPWLLE VNLS PS LACDAP LDLKI KAS M I S D 
MFTWG FVCQD P AQRAS TRP I YPTFE S S RRNP FQ KPQRCRPLS A 
SDAEMKNLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKEENDR 
RGGFIRIFPTSETWEIYGSYLEHKTSMNYMLATRLFQDRMTADG 
APELK I * S LNS KAKLHAAL YERKLLS LEVRKRRRRS S RLRAMRP 
KYPVITQPAEMNVKTETESEEEEEVALDNEDEEQEASQEESAGF 
LRENQAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETQE 
LEP K FNLMQ I LQDNGNLS KMQARIAFSAYLQHVQ I \RLMKDSGG 
QTFSASWAAKEDEQMELVVRFLKRASNNLQHSLRMVLPSRRLAL 
LERTRILAHQLGDFIIVYNKETEQMAEKKSKKKVEEEEEDGVNM 
ENFQE F I RQASEAELEEVLTF YTQKNKS AS VFLGTHS KIS KNNN 

E KEAKLVYS NS S SG PTATLQK I PNTHL S S VTTS DLS PG P CHHS S 
LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGLP 
RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYLNKHHS 
GIAKTQKEGEDASLYSKRYNQSMVTAEIjQRLAEKQAARQYSPSS 
H INL LTQQVTNLNLATG I INRS S ASAP PTLRP 1 1 S PSGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VP KPP PNHEQ VLRRATSQKAS KGS SAEGQLNGLQS SLNPAAF VP 
I TS STDPAHTK IMNHKHTEKQPVHHS WVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPLAGEKFVEVYKEAHLLALHIESSSRNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLIiGP PVGEPRLLASS PALPS SGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
ka i v v \ jn iUj^ixKATJjijKAPGS I SN \LQR KS S S GA\VWSGASSA 
CTPQPVAKAKS SEFAS I PAN * LPGLCPNI SKS \GRMGPAMLRPA 
L\ PAGPVG \AS S WQAKRVDVS ELAAEQLTAP P \ S AS PTQPQTPE 
GGG\QWLNS SCAWSESSQLNKTRS I RRRDSCLNS KTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
BALL VD I KLE PLAVTPDAASQPL IDLPL I DFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDB P S ACRAGD VNMDDP KKED I LLLADE KFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TS ES P FAWS PLAGEKFVEVYKEAHLLALH I ESSSRNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DSPLLGPPVGEPRLLASS PALPS SGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGI PGAAEKPKK 
E I PAS PSRTKI PAEKE SHRDVLPDKPAPGAVNVPAAGS HLGQGK 
RAI P VP\NKLGLKKTLLKAPGS YSN\ T.OP K<z \ vwer' bqci 
CTPQPVAKAKSSEFAS I PAN* LPGLCPNISKS\GRMGPAMLRPA 
L\ PAGP VG\ASSWQAKRVDVS ELAAEQLTAPP\SAS PTQPQTPE 
GGG\QWLNSS CAWS ES SQLNKTRS IRRRDSCLNS KTKVMPTPTN 
Q FKI PKFS IGDS \ PDS STPKLS RAQRPQS CTS VGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDI KLE PIAVTPDAASQPL I DLPLIDFCDTP EAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPS P WGQLI DLSS PL IQLS PEADK 
ENVDSPLLKF 


5825 


2 


4210 


FLQ I E S AS P AP FSS G F LAAH PHS PGGS LATKGRS RLS APGMLHL 
SAAP PAPPPE VTATARPCLCS VGRRGDGGKMAAAGALERSFVEL 
SGAERERPRH FREFTVCS I GTANAVAGAVKYS ESAGG FY YVES G 
KL FS VTRNR F I HWKTSGDTLE LME E S LD I NLLNNAI RLKFQNCS 
VLPGGVYVSETQNRVIILMLTNQTVHRLLLPHPSRMYRSELWD 
S QMQS I FTD I GKVD FTD PCNYQL I PAVPG I S PNSTASTAWLSS D 
GE AL FALP CAS GG I F VL KLP P YD I PGMVS WELKQS S VMQRLLT 
GWMPTAIRGDQS PSDRPLSLAVHCVEHDAF I FALCQDHKLRMWS 
YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 
GIF\MHAPKRGQFCIFQLVSTESNRYSLDHISSLFTSQETLIDF 
ALTS TD I WALWHDAENQTWKY I NFEHNVAGQ WNP VFMQ PLPEE 
EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
DLS WSELKKE VTLAVENE LQGS VTE YE FS QEE FRNLQQE FW CKF 
YACCLQ YQEALSHPLALHLNPHTNMVCLLKKGYLSFL I PSSLVD 
HLYLLP YENLLTEDETT I SDDVD IARDVICL I KCLRL I EES VTV 
DMSVIMEMSCYNLQSPEKAAEQILEDMITIDVENVMEDICSKLQ 
EI RNP IHA IGLL I REMD YETE VEMEKG FNPAQ PLNIRMNLTQLY 
GSNTAGYIVCRGVHKIASTRFLICRDLLILQQLLMRLGDAVIWG 
TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LQHLSVLELTDSGALMANRFVSSPQTIVELFFQEVARKHIISHL 
FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 
GNCOYVOLODYIOLLHPMCOVNVG^PT?Rivrr.nRPVT.\7 T TY3i7nnira^ 

ecfcqaasevgkeefldrlirsedgeivstprlqyydkvlrlld 
viglpelviqlatsaiteasddw\ksqatl\rtcifkhhl\dlg 
\hnsqaygsl* pqi pdssrqldclrqlvvvlcersqlqdlvefs 
yvnlhne WG I iesraravdlmthny yellyafh r YRHNYRKAG 

TVMFEYGMRLGREVRTLRGLEKQGNCYIiAALNCLRLIRPEYAWI 
VQPVSGAVYDRPGASPKRNHDGECTAAPTNRQIEILELEDLEKE 
CSLAR I RLTLAQHDPS AVAVAGS S S AEEMVTLLVQAGLFDTA I S 
LCQTFKL P LTP VFEGLAFKC I KLQ FGGE AAQAEAWAWLAANQLS 
SVITTKESSATDEAWRLLSTYLERYKVQNNLYHHCVINKLLSHG 
VPLPNWL INS YKKVDAAELLRLYLN YDLLDLTPYQVIRI CGC 


5826 


3 


871 


KSQLLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEQQRQLKKQKNR
AAAQRSRQKHTDKADALHQQHESLEKDNLALRKEIQSLQAELAW
WSRTLHVHERLCPMDCASCSAFGLLGCWDQAEGLLGPGPQGQHG
CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP
AWAEPPVQLSPSPLLFASHTGSSLQGSSSKLSALQPSLTAQTA
PPQPLELEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPSPHPLLAFPLLSSAQVHF 


5827 


194 


2287 


GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVNKAAKVP* *HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLEVALETLSSAE VCAG I YDILLAL I FLHDRGHLTHNNVCL 
S SVFVSEXK3HWKLGGMETVCKVSQATPEFLRS IQS IRDPAS I PP 
EEMS PE FTTLPECHGHARDAFS FGTLVESLLTILNEQVS ADVLS 
S FQQTLHSTLLNP I PKWRPALCTLLSHDFFRNDFLE WNFLKSL 
TLKS EEE KTE FFKFLLDRVS CLS EEL IAS RLVPLLLNQL VFAE P 
VAV\KSFIiPYLLGPKKDHAQGETPCLLS PALFQSRVI PVLLQLF 
EVHEEHVRMVLLSHIEAYVGALSiiREQLKKV\IL\PQVLLG\LR 
D\TSDSIVAITLHSLAVLVSLLGPEVWGGERTKIFKRTAP\SF 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCTTLDV 
EESSWDDCEPSSLDTKVNPGGGITATKPVTSGEQKPIPALLSLT 
EESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSEL 
GLGEE FT I Q VKKKP VKD P EMDW FADMI PE I KPSAAFL I LPELRT 
EMVPKKDDVSPVMQFSSKFAAAEITEGEAEGWEEEGELNWEDNN 
W 


5828 


2 


257 


AREGGSLGAVAACX5ELSYSCDFCPARPHTSWLTRFVKMEFQAW 
MAVGGGSRMTDLTSSIPKPLLPVGNKPLIWYPLNLLERVGFEEV 
IWTTRDVQKALCAEFKMKMKPDIVCIPDDADMGTADSLRYIYP 
KLKTDVLVLS CDLITDVALHBWDLFRAYDASLAMLMRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GS I LQKHPRI RFHTGLVDAHLYCLKKY I VDFLMENG\ S I TS IRS 
E L\ I P YLV/RGKQFS SAS SQQGTRKEKEGGS KGKRGLKS FRISY 
SFY*KEANYTGTGAPY\D\ACWI 


5829 


260 


1259 


PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVGFANFVDFNPS
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN
YLITASSDGTLKILDLLKGRLIYTLQGHTGPVFTVSFSKGGELF
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSPPHLLDIY
PRTPHPHEEKVETVEDFFLHLLRLIQSLR*SICRSLLPLLWISF
LLILPQQQKPWGLCQTRVKRPVDIS*TLP*CHQNVCQQPRKRK 
QKT*VTSPVKVK/VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
NI EAAVQDRLNEQEGVP SVFNP PPSRPLQVNTADHR I YS YWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDIVSFMHSFEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
LVYLHGDDHQDS DEFCRNTLCAPE VI S LINTRMLFWACS TNKPE 
G YR VS QALRENT YP FLAM I MLKDRRE * PV\VGRLEGLI \QPDDL 
INQLTF I MDANQT YLVS ERLEREE RNQTQVLRQQQDEAYLAS LR 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLS HTE VLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHAVMDSERQVKD 
TDDI ESP KRS IRDSGYI DCWDSERSDS LSPPRHGRDDS FDS LDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQD YGPRT\ PVS \DDAES TSMFDMRCEE 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDL I KKE E ERKKME KLLAG E D GTS ERRKS I KT YRE I VQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKFTATVETTIARAS 
VLDTSMSAGSGSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVS VNGETVHREEE KE RE CPTVAP AHSLTKS QM FEG VARVH 
GS PL EL KQDNG S I E I N I KKPNS VPQELAATTEKTE PNSQE DKND 
GGKS RKGNI ELAS S E PQHFTTTVTRCS PTVAFVE FPS S PQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
KMPEANQLHLPNLNS Q VD S PS S EKS P VTTPFKFWAWD PEE E RRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YE EE P * 1 1 \ EDP WPFTVS SS S ADQLS TS SSMTEGSGTMNKIDL 
GNCQDBKQDRRWKKSFQGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNPVSKGVHEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSS EDVKPKTLPLDKS INHQ I ES PS ERRKS I SGKKLCS S CGL 
PLGKGAAMI IETLNLYFHIQCFRCG\ ICKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRS RS AGQ PTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN 
SENLEKLEKLGMS S DLVSRLPT I YRNAHD I KNKSSAPSRVPPLF 
VPQGTSERKDS SGS VS PNTLS QEEGDQI CLYHI RKS CS FQDKCH ' 

RVHFHLPYRWQFLDRGKWEDIiDNMELIEEAYCNPKIERILCSES 

ASTFHSHCLNFNAMTYGATQARRLSTASSVTKP PHF I LTTDWI W 

YWSDEFGSWQEYGRQGTVHPVTTVSSSDVEKAYLAY/WYTGV+R 

PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN*KGLPQTQIR\AP 

QDVTTMQTCNTKFPGPKSIPDYWDSSALPDPGFQKITLSSSSEE 

YQKVWNLFNRTLPFYFVQKIERVQNLALWEVYQWQKGQMQKQNG 

GKAVDERQLFHGTSAI FVDAI CQQNFDWRVCGVHGTS YGKGS YF 

ARDAAYSHHYS KS DTQTHTMFLARVLVGE FVRGNAS F VRPPAKE 

GWSNAFYDSCVNSVSDPSIFVIFEKHQVYPEYVIQYTTSSKPSV 

TPSILLALGSLFSSRQ 


5833 


170 


3289 


SILCLLSPCWQFGKPWSILSSRSRHSPCTKKGWEGMRKHIiHT 
RQGHK* VHVE I S KAL W VYRDD YF I RHS I S VS AVI VRAW I THKYR 
GRDWNVKWEENL LHAVAKNYTLLQT I P P FER P FKDHQ VCLE WNM 
G Y I WNLRANR I P Q C P LEND WALLG FP YAS SGENTG I VKKFP RF 
RNRELE ATRRQRMD Y P VFT VS L W L YLLH Y CKANL CG I L YF VDS N 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
D I S FNGGQ I WTTS IGQDLKS YHNQT I S FREDFH YNDTAG YFI I 
GGSR YVAG IEG FFGPLKYYRLRSLHPAQI FNPLLEKQLAEQ I KL 
YYERCAEVQE I VS VYASAAKHGGERQEACHLHNS YLDLQRR YGR 
PSMCRAFPWEKELKDKHPSLFQALLEMDLLTVPRNQNESVSEIG 
GKIFEKAVKRLSSIDGLHQISSIVPFLTDSSCCGYHKASYYLAV 
FYETGLNVPRDQLQGMLYSLVGGQGSERLSSMNLGYKHYQGIDN 
YPLDWELSYAYYSNIATKTPLDQHTLQGDQAYVETIRLKDDEIL 
KVQTKEDGDVFMWLKHEATRGNAAAQQRLAQMLFWGQQGVAKNP 
EAAIEWYAKGALETEDPALIYDYAIVLFKGQGVKKNRRLALELM 
KKAASKGLHQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE\MGN 
PDAS YNLGVLHLDGI FPG VPGRNQTLAGE YFHKAAQGGHMEGTL 
WCS LY YITGNLETFPRD P EKAWWAKH VAEKNG YLGHV I RKGLN 
AYLEGS WHEALLYYVIiAAETGI EVS QTNLAH I CEERPDLARR YL 
GVNCVWRYYNFSVFQ I DAPS FAYLKMGDLYYYGHQNQSQDLELS 
VQMYAQAALDGDSQGFFNLALLIEEGTIIPHHILDFLEIDSTLH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SALI YFLGTFLLS I LIAWTVQ YFQS VSASDP PPRPSQAS PDTAT 
STAS PAVT PAADAS DQDQ PTVTNNP E PRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/QECG 
SAAPGPIPGQSSS'VPLRLEQIQQKADCPLSLELALKPRMAAQV 
TLEDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTG I ARY I EQATVHS SMNEMLEEGQE YAVMLYTWRS CS RAI 
PQVKCWEQPNRVEIYEKTVEVLEPEVTKLMNFMYFQRNAIERFC 
GEVRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 
KNDHS A YKRAAQ FLRKMAD PQ S I QE S QNLSMFLANHNKITQS LQ 
QQL E VI SG YE B LLADI VNLCVD Y YENRM YLTPS EKHMLL KVMG F 
GL YLMDGS VSNI YKLDAKKR INLS K IDKYFKQLQWPLFGDMQ I 
ELARYI KTSAHYEENKSRWTCTSSGSS PQYNICEQMIQIREDHM 
RFI SELAR YSNSEWTGSGRQEAQKTDAEYRKLFDLALQGLQLL 
SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVE VIAM I KGLQVLMGRMESVFNHAIRHTVYAALQDFSQ 
VTLME PLRQAI KKKKNVI QS VLQAI R KTVCDWE TGHE PFNDPAL 
RGEKDPKSG*DIKVPRRAVGPSSTQLYMVRTMLESLIADKSGSK 
KTLRSSLEGPTILDIEKFHRESFFYTHLINFSETLQQCCDLSQL 
WFREFFLELTMGRRIQFP I EMSMPW I LTDHILETKEASMME YVL 
YSLDLYNDSAHYALTRFNKQFLYDE IEAEVNLCFDQFVYKIiADQ 
I FAYYKVMAGS LLLDKRLRS ECKNQGAT IHLPPSNR YETLLKQR 
HVQLLGRSIDLNRLITQRVSAAMYKSLELAIGRFESEDLTS IVE 
LDGLLEINRMTHKLLSRYLTLDGFDAMFREANHNVSAPYGRITL 
HVFWE LNYDFLPNYCYNGS TNRFVRTVLP FS QE FQRDKQPNAQ P 
QYLHGSKALNLAYS S I YGS YRNFVGP PHFQVICRLLG YQG1AW 
MEELLKWKSLLQGT ILQYVKTLMEVMP KI CRLPRHE YGSPGI L 
EFFHHQLKDIVEYAELKTVCFQNLREVGNAILFCLLIEQSLSLE 
E VCDLLHAAP FQN I LPRVH VKEGE RLDAKMKRLE S KYAPLHLVP 
LIERLGTPQQIAIAREGDLLTKERLCCGLSMFEVILTRIRSFLD 
DP I WRGPLPSNGVMHVDECVEFHRLWSAMQFVYC I P VGTHE FTV 
EQCFGDGLHWAGCMI IVLLGQQRRFAVLDFCYHLLKVQKHDGKD 
E 1 1 KNVPLKKMVER I RK FQ I LNDE 1 1 T I LDKYLKSGDGEGT P VE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGN IRMAQGSHQ I DFQVLHDLRQKFPEVPEVWSRCMLQNNNNL 
DACCAVLSQESTRYLYGEGDLNFSDDSGISGLRNHMTSLNLDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFN P I MVTLAPN I QTGRNT PTSLH I HG VPP P VLNS PQGNS I Y I 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
P ASNP LSHTS SQQ PNQQGHQTSHVYMP I S S PTTS Q P PT IHS SGS 
b QSSAnoQ xN 1QN I fa ITjPRKNQ IE I KLE PPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVYIAASPPNTDELMSRSQPKVYISA 
NAATGD E Q VMRNQ P TLF I S TN S GASAAS RNMSGQ VS MG PAF I HH 
HPPKSRAIGNNSATSPRVWTQPNT\EYTFKITVSPNKPPAVS? 
GWS PT FELTNLLNHPDHYVETEN IHHLTDPTLAHVDRIS ETRK 
LSMGSDDAAYTQDI * RISNS WLGM VAHACNSSALGGQDGR 1 1 + A 
QEFET S WGNI WRLR L YRRF * N YAGMVAHTCS PS YS VD * ALL VHQ 
KARME RLQ RELE I Q KKKLDKL KSEVNE MENNLTRRRLKRSNS I S 
QIPSLEEMQQLRSCNRQLQIDIDCLTKEIDLFQARGPHFNPSAI 
HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK 
S D VNYQ CL FS AHVLHLRGVLTTQPVED E RGNVFL WNGE I FSG I K 
VEAEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
HYLWFGRDFPGRRSLLWHFSNLGKSFCLSSVGTQTSGLANQWQE 

vrno \UC oIj.uJ.1joJ-j.1joJ: rUALiff IHlwl JjuW lr IJoKlijlj^lVrlblA* 

VKFQQT YQHLYQR* QMKPNC I LKNLLFL * I * CCHKLH WRLI AVI 
FPMCHLQERYFKSFLLMYT*KEVIQQFIDVLSVAVKKRVLCLPR 
DENLTANEVLKTCDRKANVAILFSGGIDSMVIATLADRHIPLDE 
P I DLLNVAFIAE EKTMPTTFNREGNKQKNKCEI PS EE FS KDVAA 
AAADS PNKHVSVPDRITGRAGLKELQAVSPSRIWNFVEINVSME 
ELQKLRRTR ICHLI RPLDTVLDDS IGCAVWFASRG IGWLVAQEG 
VKS YQSNAKWLTG I G ADEQLAG YSRHRVRFQSHGLEG LNKE I M 
ME LGR I S S RNLGRDDRVIGDHGKE ARF P FLDENWS FLNS LP I W 
E KANLTL PRG IGE KLLLRLAAVE LGLTAS ALLPKRAMQFGSR I A 
KMEKINEKASDKCGRLQIMSLEKLSIEKETKL 


5837 


4792 


903 


NGNAVAQAPVTNCC YLATGS KDQTIR I WS CSRGRGVM I LKLPFL 
KRRGGG I D PTVKERL WLTLHW P SNQPTQL VS S CFGGELLQWDLT 
QS WRR KYTLFS ASS EGQNHSR I VFNLCPLQTEDDKQLLLS TSMD 
RDVKCWDIATLECSWTLPSLGG FAYS LAPS SVDIGS LAI GVGDG 
MIRVWNTLSIKNNYDVKNFWQGVKSKVTALCWHPTKEGCLAFGT 
DDGKVGLYDTYSNKPPQISSTYHKKTVYTLAWGPPVPPMSLGGE 
GDRPSLAL YS CGGEG I VLQHN P WKLSGE AFD I NKL I RDTNS I K Y 
KLPVHTBISWKADGKIMALGNEDGSIEIFQ\IPNLKLICTIQQH 
HKLVNTISWHHE\HGSPAQKLSYL\MPSGSQQCSPFTCHNLKNC 
P*KAAPESPSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH*WEGL 
VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPL 
DPD CI YS G \ ADDFCVHKWLTS MQDHSR P PQG KKS I ELE KKRLSQ 
PKAKP KKKKKPTLRT P VKLES I DGNEEE S M KEN SGPVENG VSDQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
ILLKKEPPKEKPETLIKFCRKARSLLPLSTSLDHRSKEELHQDCL 
VLATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMIDIEGKG 
HLENGHPELFHQLMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWLWAVEAFAKQLCFQDQYVKAASHLLS I H KVY E AVE LLKS 
NHFYREAIAIAKARLRPEDPVLKDLYLSWGTVLERDGHYAVAAK 
CYLGATCAYDAAKVLAKKGDAAS LRTAAELAAI VGEDELS AS LA 
LRCAQELLIjANNWVGAQEALQLHESLQGQRLVFCLLELLSRHLE 
EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQY 
QEAFQKLQN I KYPS ATNNTPAKQLLLHI CHDLTLAVLSQQMAS W 






WO 01/53312 PCT/US00/34263



DEAVQALLRAWRSYDSGSFTIMQEVYSAFLPDGCDHLRDKLGD 
HQS PATPAFKSL EAF FLYGRL YEF WWS LS R P C PNSS VW VRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGERM 
LSTFKELFSEKHASLQNSQRTVAEVQETLAEMIRQHQKSQLCKS 
TANGPDKNEPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRLTE 
ANQRMAKFPES I KAW PFPDVLECCLVLLL I RSHFPGCLAQEMQQ 
QAQELLQKYGNT KT YRRHCQT FCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECLDPAQRDLYRDVMLENYSNL 
ISLDLESSCVTKKLSPEKEIYEMES\PSGRIWGNVSTITFQYNG 
LGDNMECKGNLEGQVSKSEGLYMCVKITCEEKATESHSTSSTFH 
RI I / H YQG K I VKC KE CRQG FS YLS CL I QHEENHN I * KCS E VNKH 
RNTFS KKPS Y I * HQ \ KFRLGEKP YECMECGKAFGRTSDLIQHQK 
IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKP YE C KE CGKAF I LGSHLTYHQRVHTGE KP Y I CKE CG KAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKE CGKAF I SNSNL I QHQRIHTGE KP YKCKECGKAF I CGKQLS 
EHQR I HTGE KP FE CKE CGKAF IRVAYLTQHE K I HGE KH YE C KEC 
GKTF VRATQLTYHQR I HTGE KP YKC KE CD KAF / HLWLT I LS E HQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSLALASLDFAHLQEKNPEN 


5839 


1 


2425 


GRPFPRPPRALPRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAL 
EE VE GD VAE LELKL\ DKLVKLC I A\M I DTGKAFCVANKQ FMNG I 
RD\LAQNS\NNDA\WETKFAPSFLDSLQEMINFHTIL/Ij*PNS 
EIN*GHS FQNFVKEDLRKFKDAKKQFENSQ* KRKKIALVKNAPV 
PSRPASLEL*KPPNILTATRKCFRHIALDYVLQINVLQSKRRSE 
ILKSMLSFMYAHLAFFHQGYDLFSELGPYMKDLGAQLDRLVGDA 
AKEKREMEQKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RASNAFKTWNRRWFS IQNNQWYQKKFKDNPTVWEDLRLCTVK 
HCED I ERR FC FE WS PT KS CMLQADS E KLRQ AW I KAVQTS I \ AT 
AYREKDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 
CIPGNASCCDCGLADPRWASINLGITLCIECSGIHRSLGVHFSK 
VRSLTLIXTWEPELLKLMCEIiGNDVINRVYEANVEKMGIKKPQPG 
QRQE KEAYIRAKYVERKFVDKI FL* SLS PP\EQQKK\FVS KSS E 
EKRLSISKFGP\GDQVRASAQSSVRSNDSGIQQSSDDGRESLPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPKM 
AEALAHG ADVNWANS EENKAT PL I QAVLGGSLVTCE FLLQNGAN 
VNQRD VQGRGPLHHATVLGHTGQVCLFLKRGANQHATDE EGKD P 
LSIAVEAANADIVTLLRLARMNEEMRESEGLYGQPGDETYQDIF 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQ I S S PRWRS EQRAFMSALS KTQTQSAPALQ 
GLSSLLQSVTGNPVPASEAASQSTSASPANTTVYTIKGRNLPSS 
AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGLPSTAFKLP 
SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 
\EMKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPPGLGSE 
SPYKQPSDGMERPSSLMDSSQEKFYPDTSFQEDEDYRDFEYSGP 
PPSAMMNLQKKPAKSILKSSKLSDTTEYQPILSSYSHRAQEFGV 

SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
LFSPQNTIJ^APTGHPPTSGVEKVLASTISTTSTIEFKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQBEHY 
R I ETRVS SS CLDL PDS TE E KGAP I ETLG YHS ASNRRMS GE P I QT 
VE S IR VPGKGNRGHGREAS RVGWFDLS T SGSSFDNGPS S AS ELA 
SLGGGGSGGLTG FKTAPYKERAPQFQES VGS FRSNS FNSTFEHH 
L P PS P LEHGTP FQRE P VG P S S APPVP P KDHGG I FS RDAPTHL P S 
VDLSNPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 
TL PS HS LEHIjG PPHGGGGGGGSNS SSGP P LG PSHRDT I S RSG 1 1 

LRSPRPDFRPREPFLSRDPFHSLKRPRPPFARGPPFFAPKRPFF 
PPRY 


5841 


1908 


762 


GLRLFLVLTVWPMMKPSWLSRTEFSKRLLCRTLWCQSGWSSRSY 
TRSMLKMTTS INRRSRTSTKSTRTSARPGLTATVS IGLSDSPTW 
RHCWMTARS CS GEKGGHWAPRQVG VYLLPGRVGCVS S RVS PS FP 
GDGLDSG LARRGSAVS ALASGLVE E PMLGPP FHPT PR F KAVSAK 
S KEDLVSQGFTEFTI EDFHNTFMDL IEQVEKQTSVADLLASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 
\VEPMCKESDHIHIIALAQGLQRVHPGWEYMGPRPRAATTNPHI 
FP*GLPSPKVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRGFSVWKWSYFTPFFLSHDPPPMFY 


5842 


307 


1918 


QE P TADFKLR S TCGCGREMTC PD KPGQL INW F I CS LC VPR VRKL 
WSSRRPRTRRNLLLGTACAIYLGFLVSQVGRASLQHGQAAEFCGP 
HRS RDTAEPS FPE IPLDGTLAP PESQGNGSTLQPNWY I TLRSK 
RS KP ANIRGT VKP KRRKKHAVAS AAPGQEALVG P S LQ P QEA\ EG 
KLM L * HIiGTLR EQTWLRLE S DPG GWCG VRE / WRAGG P D FLQPSS 
RESNIRIYSESAPSWLSKDDIRRMRLLADSAVAGLRPVSSRSGA 
RLLVLEGGAPGAVLRCGPS PCX3LLKQPLDMSEVFAFHLDRI LGL 
NRTLPSVSRKAEFIQDGRPCPIILWDASLSSASNDTHSSVKLTW 
GTYQQLLKQKCW QNGR VP KPESG CTE IHHHEWS KMAL FDFLLQI 
YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHIIQRKH 
DPRHLVFIDNKGFFDRS EDNLNFKLLEGI KEFPAS AVYVLKSQH 
LRQ KLLQS LFLDKG YWE SQGGRQG I E KL ID V I EHRAKI L I T Y I N 
AHGVKVLPMNE 


5843 


500 


1453 


GTARLVTCWVLHGQ*VKKPAWEPGWWL*Q*RCRPKGWGLGAGM 
RGSRMSQPPQCLRRAQSSCCHFMVKLLDDGTFMIPGEKVAHTSL 
DALVTFHQQKP I EPRRELLTQPCRQKDPANVDYEDIiFLYSNAVA 
EEAACP VS AP EEAS PKP VL CHQS KERKP S AEM / RQNNHQG S H F L 
LPP K I PS WRD P P ETLEE PQNAP RERP EG PAAAKKP PRHCE L WT 
LGC P E I HGDLR P WDRKRQ PRS LRGS HLGGQRLHGS LCGH I S QKP 
LTAPGTKRQKGPHQEGREVGQLH*GDPRGOJSLAPNGSESPILPG 
VQARAPGLGRA 


5844 


202 


2471 


FDSAVLSSINVMAVLPGPLQLIjGVLLTISLSSIRLIQAGAYYGI 
KPLPPQIPPQMPPQIPQYQPLGQQVPHMPLAKDGLAMGKEMPHIi 
QYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKEIPLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPG AMGMPGAKGE I GQKGE IGPMG I P * P QG P PGP HGLPG IG K 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGL 
PGLPGP PGLPG I GKPGFPGPKGDRGMGGVPGALG PRGEKGP IGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 
PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGPPGPPGPPGPPAVMPPTPPPQGEYLPDMGLGIDGVKPPHAYG 
AKKG KNGG P AYEMPAFTAELTAP FP P VGAP VKFNKLL YNGRQN Y 
NPQTG I FTCE VPGVY Y FAYHVHCKGGNVW VALFKNNE PVMYTYD 
E YKKGFLDQASGSAVLLLRPGDRVFLQMP S EOAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSASLQDKMANPKEKTAN1CLVNELARFNRVQPQYKLLNER
G PAHS KM FS VQ LSLGEQTWES EGSS I KKAQQAVGNKALTE S TLP 
KP I * KPPKSNVNNNPGC ITPTVELNGLAMKRG \ KPAIHRPLDPK 
P FPNNRANYN FQ VM YNQR YHCP I PKI FYVQLTVGNNEFFGEGKT 
RQAARHNAAMKALQALQNEPIPERSPQNGESGKDMDDDKDANKS 
E I SLVFE IALKRNMPVS FEVIKESGPPHMKS FVTRVS VGE FS AE 
GEGNS KKLS KKRAATTVLQELKKLP PLPVVEKPK\HF FKKR P KT 
I VKAG PE YGQGMNP I SRLAQIQQAKKEKE PDYVLLSERGMPRRR 
EFVMQVKVGNEVATGTGPNKKIAKKNAAEAMLLQLGYKASTNIjQ 
DQLEKTGENKGWSGPKPGFPEPTNNTPKG I LHLS PDVYQEMEAS 
RHKVISGTTLGYLSPKDMNQPSSSFFSISPTSNSSATIARELLM 
NGTSSTAEAI GLKGSS PTPPCS P VQPS KQLEY LAR I QGFQVHYC 
DRQSGKECVTCLTLAPVQMTFHA I GSS IEASHDQV* YATAILLC 
YGPARKWKAI KMEAMCAHAALLSLIHYLLAPSARLEKSKLFALG 
N- 


5846 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNCSVISQDDFF 
KPESEIETD KNGFLQ YDVLE ALNME KMMS A I S CWME SARHS WS 
TDQESAEEIPILIIEGFLLFNYKPLDTIWNRSYFLTIPYEECKR 
RRS TRVYQP PDS PG YFDGHVW PM YLKYRQEMQD I TWEWYLDGT 
KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS /CK* IRK 
LQGVI 


5847 


2769 


505 


APEMEDLSS PDSTLLQGGHNLLSSAS FQESVTFKDVI VDFTQEE 
WKQLDPGQRDLFRDVTLENYTHLVSIGLQVSKPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
HKRDDSWSSNLLESWEYEGSLERQQANQQTLPKEIKVTEKTIPS 
WEKGPVNNEFGKSVKVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KE KS CKCNECG KAFS YCS ALIRHQRTHTGE KP Y KCN * / CVEKAF 
SRSENtilNHQRIHTGDKPYKCDQCGKGFIEGPSLTQHQRIHTGE 
KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECGKAFSQRG 
H FMEHQKI HTGE KP FKCDECDKTFTRS THLTQHQ KI HTG E KT YK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQ KTHTGE KP YD CAECGKS FS YWS SLAQHLKI HTGEKP YKCNEC 
GKAFSYCSSLTQHRRIHTREKPFBCSECGKAFSYLSNLNQHQKT 
HTQE KAYE CKECGKAF I RS S SLAKHER I HTGE KP YQCHECGKTF 
SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECAECGKAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 
HLTQHQRIHTGE KP YKCNECD KAFS RS THLTQHQR I HTGE KP YK 

CNECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSALN 
KHQRLHPGI 


5848 
5849


22 


2961 


AAPRRLLRGGDGDRTPRFPLPALLRPGPPAEAAPERRKMPAVSK 

GDGMRGLAVF I SD I RNCKSKEAEIKR INKELANI RS KFKGDKAL 

DGYSKKKYVCKLLFIFLLGHDIDFGHMEAVNLLSSNRYTEKQIG 

YLF I S VLVNSNS ELIRL INNAI KNDLAS RNPTFMGLALHC I AS V 

GSREMAEAFAGE I PKVLVAGDTMDSVKQSAALCLLRLYRTS PDL 

VPMGDWTSRWHLLNDQHLGWTAATSLITTLAQKNPEEFKTSV 

S LAVS RLS \ R I VTS AS TDLQD YT Y * FC PG FLGL S VKLLRLLQC Y 

PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 

ISLIIHHDSEPNLLVRACNQLGQFLQHRETNLRYLALESMCTLA 

SS EFSHEAVKTHI ETVINALKTERDVS VRQRAVDLLYAMCDRSN 

APQI VAEMLS YLETADYS IREE I VLKVAI LAEKYAVD YTW\ YVD 

TI LNL I R I AGD YVS EE VWYRVI Q I VINRDDVQGYAAKTVFEALQ 

APACHENLVKVGGYILGEFGNLIAGDPRSSPLIQFHLLHSKFHL 

CSVPTRALLLSTYIKFVNLFPEVKPTIQDVLRSDSQLRNADVEL 

QQRAVEYLRLSTVASTDILATVLEEMPPFPERESSILAKLKKKK 

GPSTVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPSPSADLLG 

LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 

RFVCKNNGVLFENQLLQIGLKSEFRQNLGRMFIFYGNKTSTQFL 

NFTPTLICSDDLQPNLNLQTKPVDPTVEGGAQVQQWNIECVSD 

FTEAPVLNIQFRYGGTFQNVSVQLPITLNKFFQPTEMASQDFFQ 

RWKQLSNPQQEVQNI FKAKHPMDTE VTKAK I IGFGSALLEE VDP 

NPANFVGAG 1 1 HTKTTQ I GCLLRLE PNLQAQMYRLTLRTS KEAV 

SQRLCELLSAQF 




3545 


1895 


KRRE I KE TVFHH VAQAGLE LLSS SN P P SSAS RS AG ITGMRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGIEVEES 
DEFIREnMKYKDATNKHSHLHREDKHITIEDLWKRWKTSEVHNW 
TLEDTLQWLIEFVELPQYEKNFRDNNVKGTTLPRIAVHEPSFMI 
SQLKI SDRSHRQKLQLKALDWLFGPLTR PPHNWMKD Fl LTVS I 
VI GVGGCW F A YTQNKTS KEHVAKMMKDLiESLQTAEQSLMDIjQER 
LE KAQE ENRNVAVEKQNL * RKMMDE I N YAKEEACRLRE LREGAE 
CELSRRQYAEQELEQVRMALKKAEKEFEIiRSSWSVPDALQKWLQ 
LTHEVEVQYYNI KRQNAEMQLAIAKDEAEKI KKKRSTVFGTLHV 
AHSSSLDBVDHKILEAKKALSELTTCLRERLFRWQQIEKICGFQ 
IAHNSGLPSLTSSLYSDHSWWMPRVSIPPYPIAGGVDDLDEDT 
PPIVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EEAI YFS AEKQWEVPDTASECDSLNSS IGRKQS PP / SKPRD I PN 
IIS/DERYQEMRCP*RIPSGGIL 


5850 


3 


1895 


KAVLNFSASGSVISLTGSNPMHDASMWHLKKNGIIVYLDVPLIiN 
LICRLKLMKTDRIVGQNSGTSMKDLLKFRRQYYKKWYDARVPCE 
SGASPEEVADKVLNAIKRYQDVDSETFISTRHVWPEDCEQKVSA 
EFFIEAVIEGLASDGGLFVPAKEFPKLSCGEWKSLVGATYVERA 
Q I LLERCIHPAD I PAARLGEM I ETAYGENFACS KIAPVRHLSGN 
Q F I LEL FHG P TG S F KDLS LQLMPH I FAQ C I P PS CNYM I LVATS G 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQIIGSQ 
RENG WAVG VE S DFD FCQTA I KR I FNDSD FTG FLT VE YGT I LS S A 
NSINWGRLLPQWYHASAYLDLVSQGFISFGS PVDVCI PTGNFG 
N I LAAVYAKMMG I P I R KF I CASNQNHVWTD F I KTG \ H YDLRG KE 
N* AQTFFTVQ * I FL PNLSNLERHLHLMANKDGQLMT E L FNRLE S 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAKWADRVQDKTCP VI ISSTAHYS KFAPAIMQALK IKE I 
NETSSSQLYLLGSYNALPPLHEALLERTKQQEKMEYQVCAADMN 
VLKSH VEQL VQNQ F I 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAI AAAEEPAGS PSVMTRAGDHNRQ 
RGCCGSLADYLTSAKFLLYLGHSLSTWGDRMWHFAVSVFLVELY 
GNS LLLTAVYGLWAGS VLVLGAI I GDWVDKNARLKVAQTS LW 
QNVSVILCGI I LMM VFLHKHE LLTMYHG WVLTS CY I L I ITIANI 
ANLASTATAITIQRDWIVWAGEDRSKLANMNATIRRIDQLTNI 
LAPMAVGQI MTFGSPVIGCGFI SGWNLVSMCVE YVLLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
ELBHEQEPTCASQMAEPFRTFRDGWVSYYNQPVF/LGWHGSCFP 
LYDCPGL*LHHHRVRLHSGTEWFHPQ YFDGS IS YNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSSYPSLPALLRARSAPGHCTHRSCGPEWRIDSISRLEMQGARR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGPFITSGPG/WFRQ 
YYFFISGRH*VLFTESDFYYVAMDFGGHGLSSKYSPGVPYYLQT 
FVSE I RRWAGKKQS VYFRRCGGCSRAP PLITGGGVGSRKQRWP 
ESGAWALAPGL PAI HGRS WES 


5853


223 


1346 


RLLGLS R VKGLHG PAASAW I SD PETRGD PGGP WGMWRGSDLRPR 
PVSLTGLTLVCK*AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLHPRRPRLHPGTRGVAVEPHALRVVHVAHGEEAGIRAAGPGH 
GGVE I PQG/ VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGESE 
SRWVRGN FRTGTAATL I GFS RNP TLNGS ENWGSLVS I QEEGP DT 
GWEREKRNPAEMGNPQRWASPIHTPPLGPEILRAMPEALRAMPE 
ALGLRPD PATS VPS ALS /QTF/PESWPRS CLRNQGETLGMG P VP 
LS SLC I TES PSQNWTPCLLLLTCPRGLF 


5854 


86 


938


KG RNTAP EKKGAALNNRENAS S * NG Y / S RWKQD I RRI ENH I IQE 
LKHLCAM I KRVLLERLENTRKLRELTEGRTLDWPQNR ITEVSAK 
RQ I VTE YRE KG KRN * E E KKRDLEGRSRR YNLCI IG I PETEDRAS 
GAET IKDhhE/ ENFPELKNELDLQME KAHR I PLKFNE KKAASRH 
I R VTF L / KFQRRN I LQAS SQR KQ VT YKG AKVRLTS DFS P AI LNA 
RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRS YGC KAP SR I SHLHK \ FL FLLLP S LLMG YS E S P PP I TDS WAP 
FISLTHHVLSQSQSPLSSNCWICLSTHTQ*FTALPADLLTWTQS 
NVSLHISYLAIPFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 
GRAVALLHLIASGLTSIQTNTASSKPPIWGY\LSTQTSFISPPP 
LCLSRTYPNPAHATMVGQVPQSLCGLIFTL/RTPCRPSILHPNY 
KIISTSAWQKVLCFSGSPTIHTSLHLTTGSSFLSFHPIPGFPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT+QPPHRGSN/RLTVDKDN 
FFLSPKPNSLHQLPSQ\TPYQALTGAAIAGSYPIWENENTLSWL
PTFTYNFCLSTPSLFFLCDTN*YLCLPANWSGTCTLVFQAPTIN 
ILPPNQTILISVEASISSSPIRNKWALHLITLLTGLGITAALGT 
GIAGITTS ITS YQTLFTTLSNTVEDMHTS ITSLQRQLDFLVGVI 
LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL * HQVADS W WQGS S LLRW I P W VAPFLG PL I FLFLLLM I G P 

CIFNLVSRFISQRLNCFIQASMQKHIDNIFHLCHV*YQSLRGNH 
SEAPEPRP 


5856 


173 


1137 


PWLHGLGLSAVFLFYL*/YVTFHLYGGIILLLLIFISIAGILYK 
FQD VLL YF P EQPS S S R L YVPMPTG I PHENI F IRTKDG I RLNLI L 
IRYTGDNSPYSPTIIYFHGNAGNIGHRLPNALLMLVNLKVNIiLL 
VD YRG YGKS EGE AS E EG L YLDS EAVLD Y VMTS P DLDKTK I YLS G 
RS LG \GAAAI HLAS DNSHRI S A I MVENTFLS I PHMAS TL FS FFP 
MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPVMMKQ 
LYELS PSRTKRLAIFPDG THNDTWQ CQG YFTALEQF I KE WKS H 
S PEEMAKTSSNVTII 


5857 


1597 


5*3 


KL IGKVLVLS WADAMAAFAVEPQGPALGS E PMMLGS PTSPKPG 
VNAQFLPGFLMGDLPAP VTPQPRS ISGPS VG VMEMRS PLLAGGS 
PPQPWPAHKDKSGAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 
GDSLTSEDH\LDDSWGDCIWGFLKASA\SYILL\QFAQYGGIS* 
NM WMSNTGNWMH I RYQS KLQAR KALS KDGR I FGE S I M I GVKP C I 
DKSVMESSDRCALSSPSLAFTPPIKTLGTPTOPGSTPRTQTMRP 
LATAYKASTSDYQVISDRQTPKKDESLVSKAMEYMFGW 


5858 


355 


1419 


PPHQPAAASTSXHQQQQPPPPPQDSSKPWAQGPGPAPGVGSAP 
PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANbSLLRRPGEKTYTQRCRFC 
LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 
ALA*NCPKPELG*YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 
MFPQ * NCWGRKP FRPNLG PHL KGAVCNRWDD P WEGPTGKGH CLN 
FAS 


5859 


307 


1503 


GGSSARPRASSRRMLSRKKTKNEVSKPAEVQGKYVKKETSPLLR 
NLM P S F I RHG P T I P RRTD I CLPDS S PNAFSTSGDG WS RNQS FL 
RTPIQRTPHEIMRRESNRLSAPSYLARSLADVPREYGSSQSFVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASGIGRVAATSLGNLTNHGS EDL PLPPGWS 
VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGT 
YYVDHTNKKAQY\RHPCAPTCTSV*STTSCHI/AS/RQQTERNQ 
SLLVPANPYHTAEI PDWLQVYARAPVKYDHILKWELFQLADLDT 
YQGMLKLL FMKE LE Q I VKM YEAYRQALLTELENRKQRQQW YAQQ 
HGKNF 


5860 


2956 


1270 


TI RVEE FPLCPGGGKAQLSSAS LLGAGLIjLQPPTPPPLLLLLFP 
LLLFSRLCGALAGPI I VEPHVTAVWGKNVSLKCL I E VNET I TQI 
S WEKIHGKSSQTVAVHHPQYGFS VQGE YQGRVLFKNYS LNDAT I 
TLHNIG FS DSGKY I CKAVTFPLGNAQS S TTVTVLVEPT VSL I KG 
PDSLIDGGNETVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETATIISQYKLFPTRFARGRRITCVVKHPALEKDIRYSFILDI 
QYAPEVSVTGYDGNWFVGRKGVNLKCNADANPPPFKSVWSRLDG 
QWPDGLLASDNTLHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 
VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP*KIA 
PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 
KKENKNPVNNL IRKDYLE EPE KTQWNNVENLNR FER PMD YYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRWMARLL 
S EGEQGI PTACAAFAQQ PAG/ E PRRGLAGVGEGGPQCS WVNYRC 
T LE FLVS LLGTD LARGRGNS ASG PTAPADS KQL / ML * DVHRR VI 
LE * RMNSGS PARDNAPSQRFCTNLS EGLRFG I S PSWREALYGCH 
A 


5862 


1556 


483 


P P FQ LI MGE I KVS PD YNWFRGTV PLK K 1 1 VDDDDS KI W S L YDAG 
PRS I RCPLIFLP PVSGTADVFFRQ I LALTGWG YRVI ALQ Y P VYW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVHSLILCNS FSDTS I FNQTWTANS FWLMPAFMLKKI VLGNFSS 
GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 
RD I PVT IMDVFDQSALS TEAKEEMYKLY PNARRAHLKTGGNFP Y 
LCRSAEVNLYVQ I HL/ R / RNS ME PNTRPLTHQWS VPRSLRCRKA 
ALASARRSSS VSLAVNDELTRCVLV* SVASAPVSRPFPSGS SGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPLAAPREDTMGPLMVLFCLLFLYPGLADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGLYPSPASRLCKSSGQWQ 
TPGATRSLSKAVCKPVRCPAPVS FENG I YTPRLGS YPVGGNVSF 
ECEDGF I \LRGS PVRQCRPNGMWDGETAVCDNGAGHCPNPG I S L 
GP \ VRTGFRFGHGDKVRYRCS SNLVLTGSSERECQGNGVWSGTE 
PICRQPYSYDFPEDVAPAIiGTSFSHMLGATNP,TQKTKESLGRKI 
Q I QRS GHLNL YLLLDCSQS VS ENDFL I F KESASLMVDR I FS FE I 
NVSVAIITFASEPKVLMSVLNDNSRDMTEVISSLENANYKDHEN 
GTGTNTYAALNSVYLMMNNQMRLLGMETMAW\QEIRHAI ILL\T 
DG K \ S HMGG S P KT AVDH I RE T LN T NOTTRMD Y T.n T T rtVCZYl ,nu 
DW RELNE LGS KKDGERHAF I LQDT KALHQ V FEHM LDVS KLTDT I 
CGVGNMS ANASDQERTPWHVT IKPKSQET \ C\ RGALI SDQWVXT 
AAHCFRDGNDHS L WRVNVGDP KSQWG KE FL IE KAVI S PG FDVFA 
KKNQG I L\EFYGD \ DI ALL \ KLAQKVKM\ STHCQGPSCLP \ CTM 
\EANLGFLRETFKGSTCR\DHENEL/ VWNKQS V\ PAHF\ VAL \N 
GS KLEHLTLRMGVE WTS CCRGLS P KKKTM \ FPNLT \ D VRE \ WT 
D \ QFL \ CS \GPQEDE S P \ CK * E \ SGGA\ VFLERRFRLSAGGVW C 
SWGL\YNP\CLGSA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q * S P WLRQHPGGMS * I FL P LLANGHLS P FACP AR I CR P LHFLP S 
EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSFLWACSFTMG
KLPPS I P PSSPLACVLKNLKPLQLTPDLKPKCLI FFCNTAWPQ Y 
KLDNDSK* PENGTF EFS I LQ VLDNS CH KMGKWS E VPDVQAF F \ S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 
S LQ FHS VTS PP P P AQQ FTLKKVAGAKG I VKVSAP FS LS Q I R * RL 
GSFSSNIKIQPSSWLIWQQP 


5865


568 


1684 


CLPGPRWGEGWRAGHTIVGCI FFKTAI ISHFKGGMYLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\ICRCISMYTREHAC 
ACraV*VYMCMS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 

/CVYVCVLC\A^CMRMSTC\™LVYG*ACTCW^M/CSCTCR/C
VHVCCMSMHACECLCVYLHICGCAGTRRWWAGSARGSRSCSRLP
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW
GSPAAVCSRNCTVSPRRGADCFEAPDVPKQPPGWGRASFEERGC
GGRGWVCAPPLNGPQCCCFSIKPELKAKKKK


5866


98 


3197 


ARPEVPAP PAWLS RRGAAKMGDKKDDKDS PKKNKGKERRDLDDL 
KKEVAMTEHKMS VEEVCRKYNTDC VQGLTHSKAQB I LARDGPNA 
LTPPPTTPEWVKFCRQLFGGFSILLWIGAILCFLAYGIQAGTED 
DPSGDNLYIX3IVLAAVVIITGCFSYYQEAKSSKIMESFKNMVPQ 
QALVIREGEKMQVNAEEVWGDLVEIKGGDRVPADLRI I SAHGC 
KVDNS SLTGESE PQTRS PDCTHE \NPLKTRNITFFSNNFVEGTA 
RGWVATGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLITGV 
AVFLGVS FFI LSL I LGYTWLEAVI FLIGI IVANVPEGLLATVTV 
CLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRM 
TVAHMW FDNQ IHBADTTEDQSGTS FDKS S HTWVALF * H / LLG FC 
NRPVFKGGQDNIPVLKRDVAGDASESALLKCIELSSGSVKLMRE 
RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGI KVI MVTGDHP I TAKAI AKGVG 1 1 FEGNETVED IAARL 
NI P VSQVN PRDAKACV IHGTDLKDFTS E Q I DE I LQNHTE I VFAR 
TSPQQKLI I VEGCQRQGAI VAVTGDG VNDS PALKKAD I GVAMG I 
AGSDVSKQAADMILLDDNFAS IVTGVEEGRLI FDNLKKSIAYTL 
TSNIPEITPFLLPIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESD I MKRQ PRNPRTDKLVNERL I SMAYGQ IGM I QALGG FFS 
YF V I LAENG FLPGNLVGI RLNWDDRTVNDLED S YGQQWTYEQRK 
WE FTCHTAFFVS I VWQWAD L 1 1 CKTRRNS VFQQGMKNKI L I F 
GLFEETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YD E I RKL I LRRNPGG W VEKET Y Y 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSSPVAKP 

D FS QNWKALQ EWLLKQ KS QAP E K P LVIS QMG S KKKPK 1 1 QQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDDVDPAD I EAAIGPEAAKI ARKQLGQS EGSVSLSLVKEQAFG 
GLTRALALDCEMVGVGPKGEESMAAR VS I VNQYGKCVYDKYVKP 
TEPVTDYRTAVSGIRPENLKQGEELEVVQKEVAEMLKGRILVGH 
ALHNDLKVL FLDH P JOCK T RDTO V YK" P PK *3 <") VV<5 n R P Q T ,P T .T . Q tt v 

ILGLQVQQAEHCSIQDAQAAMRLYVMVKKEWESMARDRRPLLTA 
PDHCSDDA+QSCPAAAAAPLQRQCDQSQGQITSPQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGASHTQDASQSTS AKYPAAAQNL / CVTNAMREDLADI WYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TE RSAFTERD AG S GL VTRLRER P ALL VS STS WTEDEDFS I LLAA 
LESRV*T\MTLDGHNLPSLVC!VTTGKGPIiREYVqPTiTKOT?'VTPr>U 
I QVCT P WLE AED Y PLLLGS ADLG VCLHTS S SGLDL PMKWDM FG 
CCLPVGAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL / CVTNAMREDLADI WY IR 
AVT VYD KPAS FFKETPLDLQHRL FM KLGSMHS P FRARS E PEDP V 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA

LES R V* T \ MTLDGHNLPS LVd V T TdKCZ P T ,R R VV Q P. T , T WO TTWPn U 

IQVCTPWLEAEDYPIxI^SADl^VCIxHTSSSGLDLPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/ CVTNAMREDLADlWYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAF TERDAGSGLVTRLRER PALLVS STS WTEDED FS I LLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH 
IQVCTP WLEAEDYPLLLG S ADLG VCLHTS S SGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVIiPLVMDT 


5871 


3 


3465 


FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS 
VLKLL* LS LRRL * LE PT I * NGLLT* CS RLS VFRFLKV\ GS VYE P 
LKS INLPRPDNETLWDKLDHYYRI VKSTLLLYQS PTTGLFPTKT 
CGGDQKAKIQDSLYCAAGAWALALAYRRIDDDKGRTHELEHSAI 
KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YE E YGHLQ I NAVS L YLL YLVEM I S SGLQ 1 1 YNTDE VS F I QNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L*KQFNGFNLFGNQGCSWSVIFVDLDAHNRNRQTLCSLLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSQTLDKVWKLKGKYGFKR 
FLRDGYRTSLEDPNRCYYKPAE IKLFDG IECEFP I FFLYMMIDG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRDGKLFLWGQALYI I AKLLADELISPKDI 
DPVQRYVPLKDQROTSMRFSNQX5PLENDLVVHVALIAESQRLQV 
FLNTYG I QTQTPQQ VE P IQ I WPQQ ELVKA YLQLG I NE KLGLSGR 
PDRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLLID 
DIKNAU3FIKQYWKMHGRPLFLVLIREDNIRGSRFNPILDMLAA 
LKKGIIGGVKVHVDRLQTLISGAWEQLDFLRISDTEELPEFKS 
FEELEPPKHSKVKRQSSTPSAPEI^QQPDVNISEWKDKPTHEIL 
QKLNDCS CLAS QAI LLG I LLKREGPNF I TKEGT VSDH I ERVYRR 
AGSQKLWS WRRAASLLS KWDS LAP S I TNVL VQGKQVT LGAFG 
HEEEVISNPLSPRVIQNIIYYKCNTHDEREAVIQQELVIHIGWI 
ISNNPELFSGTLKIRIGWIIHAMEYELQIRGGDKPALDLYQLSP 
S E VKQLLLD I LQ PQQNGRCWLNRRQ I DGS LNRTPTG FYDR VWQ I 
LERTPNGIIVAGKHLPQQPTLSDMTMYEMNPSLLVEDTLGNIDQ 
PQYRQ I WBLLMWS I VLERNPELE FQDKVDLDRLVKEAFNE FQ 
KDQSRLKEIEKQDDMTSFYNTPPLGKRGTCSYLrKAVMNLLLEG 
EVKPNNDDPCL I S 


5872 


68 


665 


VQGYMYRFVIKINSCYSEKTSICRHRCCPELPATQPWPTPTVFF 
NIAIDSESLGCI\SFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYRIIPG\LCQGGDFTHHNGTGGKSLYSKEFDDENFI/LKH 
TAPGVLSTANAG PTTNG S Q FF I CTAKTEDG * QHWFGKVKDGMS 
IVEALERSGSRNGKTSKKITAANCGQL 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCH YGTKLACC YG WRRNSKGVCEATCE PG CKF 
GEC VG PN KCRC F PG YTG KTCSQDVNE CGMKPRP CQHRC VNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCIjCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENSVKEVLRAPGTIKDRIKKLLAHKNSMK 
KKAKI KNVTPE PTRTPTP KVNLQP FNYEEI VSRGGNSHGG \ KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGLIL\ VQRKALTS KLE^KADLNI SVDCS FNHG \ I CDW \ KQDR\ 
EDDFDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLLL 
PDLQPQSNFCLLFDYRLAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKWKTGKIQLYQGTDATKS I IFEAERGKGKTGEIAVDGVLLVS 
GLCPDS LLS VDD 


5874 


2 


3387 


ACPRLARRRRRVRSLRRRRGWLRARWSRGQNNMAARRITQETFD 
AVLQEKAKRYHMDASGEAVSETLQFKAQDLLRAVPRSRAEMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
S YFRKE CGRDLE FSHS NS RDQ VIGHRKLGHFRS QDW KFALRGS W 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECLEKE\ 
S RDYD VDHSG \ EA\ DS VLRGS \ SQVQA \ RGRALN I VDQEG S L LG 
KGETQGLLTAKGGVGKLVTLRNVSTKKIPTVNRITPKTQGTNQI
QKNTPS PDVTLGTNPGTED IQFPIQKI PLGLDLKNLRLPRRKMS 
FDI IDKSDVFSRFGI E 1 1 KVJAGFHTI KDDI KFSQLFQTLFBLE T 
ETCAKMLAS FKCS LKPEHRDFCFFT I KFLKHSALKTPRVDNE FL 
NMLLDKGAVKTKNCFFE 1 1 KP FDKY I MRLQDRLLKS VTPLLMAC 
NAYELSVKMKTLSNPLDLALALETTNSLCRKSLALLGQTFSLAS 
S FRQEKI L * AVGLQD I APS PAAFPNFEDSTLFGREY IDHLKAWL 
VS SGCPLQVKKAEPE PMREEE KMI PPTKPE IQAKAPS SLSDAVP 
QRADHRWGTIDQLVKRVIEGSLSPKERTLLKEDPAYWFLSDEN 
SLEYKYYKLKXiAEMQRMSENLRGADQKPTSADCAVRAMLYSRAV 
RNLKKKLL P \ WQRRGLLRAQG \ LRG\ WKARRA\TTGTQTLLFLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVDISEAPQTSSPCPSADIDMKDNGRTAE 
KLARFVAQVG\PEIEQF\SI\ENSTDNPDLWFL\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
GPSLEGSTPADGLPGEA\AEDDL/ALGAPALFTGLIiQVTCFPFG 
RGFS S KS LKVGMI PAP KR VCL I QE P KVHE P VR I A YDRPRGRPMS 

KKKKPKDLDFAQQKL\TDK\NLGFQ\MLQKT4GWKEGHGLGSLGK

GIR\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQL 
IFVF 


5875 


296 


1846 


LAALGGLPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LE FSGS LFPHAI CLGD VDNDTLNEL WGDTSGKVS VYKNDDS RP 
WLTCSCQGMLTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VLDAS GHHE TL I GEEQRP VFKQH I P ANTKVML I S DIDGDGCREL 
WGYTDRWRAFRWEELGEGPBHLTGQLVSLKKWMLEGQVDSLS 
VTLGPLGLP ELMVSQPGCAYA I LLCTWKKDTGS PPAS EGPTDGS 
/ S GD P S CPRRGAAPD I WP YPQQE CLHS PNWQHQT\SHGTES S GS 
GLFALCTLDGTLKLMEEMEEADKLLWSVQVDHQLFALEKLDVTG 
NGHEEWACANDGQT YI I DHNRT WRFQVDEN IRAFCAGLYACK 
EGRNSPCLVYVTFNQKIYVYWEVQLERMESTNLVKLLETKP\ST 
TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
WTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ 


5876 


1122 


224 


HLPLGVPSKVAGAAAMEPQEERETQVAAWIjKKIFGDHP I PQYEV 
NPRTTEILHHLSERNRVRDRDVYLVIEDLKQKASEYESEAKYLQ 
DLLMES VNFS PANLSS TGS R YLNALVDSAVALETKDTSLASPI P 
AVNDLTSDLFRT ECS KS EE I KI ELE KLEKNLTATLVLEKCLQED V 
KKAELHLSTER\AKVDNRRQNM\DFLKAKSEEFRFGIQAAGEQL 
SARGQ \DAFS VP IQSLVAL IRENW PRLKQQT I PLK\ KKLESYLD 
LMP\NPSHCSK*RIEEAK\RELA\SIEAELTRRVS\MMEL 


5877 


2030 


1907 


GTLGKMAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLEV 
LS RELI EMLAI S RNQ KLLQAGEENQVLELLIHRDGE FQELM KLA 
LNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKEAEQILATAVYQ 
AKE KL KS I E KAR KGA I S S EE 1 1 KYAHR I S ASNAVCAPLT WVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR+INIILILQKSVCEL 


5878 


950 


2113 


GLWKCMQLQGPHTHRVQP*PTPRQQGPQ\VPVAVIAGNRPNYLY
RMLRSLLSAQGVS PQMI TVFIDG YYEEPMDWALFGLRGI QHTP 
I S I KNAR VSQH YKASLTATFNLF P E AKFAWLE E DLD I AVD FFS 
FLSQS IHLLEEDDSLYCI SAWNDQGYEHTAEDPALLYRVETMPG 
LGWVXiRRSLYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECII 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MEKDDDFTTWTQLAKCLHIWDLDVRGNHRGLWRLFRKKNHFLVV 
GVPASPYSVKKPPSVTPIFLEPPPKEEGAPGAPEQT 


5879 


3 


981 


RLTEAAAAGSGSRAAGWAGS PPTLLPLS PTS PRCAATMASSDED 
GTNGGAS EAGEDRE APG KRRRLG FLATAWLT F YD I AMTAG W L VL 
AIAMVRFYMEKGTHRGLYKSIQKTLKFFQTFALLEIVHCLIGIV 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESWLFLVAWTVT 
E I TR YS F YTFSLLDHLP Y F I KWAR YNFFI I LYP VG VAGELLT I Y 
AALPHVKKTGMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRRKVLHG\G*L*KRMIK*SLQTRCFFQNNQDYLSPSF 
NNKNKQLCEISWIVWFLKI 


5880 


1138 


1324 


S LW CL VAGGLGLG PS SQNPLQRAG I LAR PREARGTFSALTACSA 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
♦KKKRGRCSS/WLSQPQHEREKEWLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQS EHTDGHTS VQS V I E KLQE ENRLLKQKVTHVE D LNAKWQRYN 
AS RDE YVRGLHAQLRGLQ I PHE PELMRKEI SRLNRQLEEK INDC 
AEVKQELAASRTARDAALERVQMLEQQI LAYKDDFMSERADRER 
AQS R I QELE EKVAS LLHQ VS WRQDS RE P DAGR I HAGS KTAKYLA 
ADALELMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQ C FS DEQGEE LLRHVAE CCQ 


5881 


26 


441 


GGIHPSPTEAPRAQHLTMDtTWRILFLVAAATGTHAQVQLLQSG 
S E VKKPGAS VMVS C YVSGYTLTKLSMHWVRQAPGKGLE *MGPFD 
LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 


5882 


2407 


2216 


SGCVEMLYSHSLEYNPEWISVQSAVAPAQLALNSDGDL*LHSGE 
RTRRD*QLPEAGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
HYSKQVELELQQIEQKSIRDYIQESENIASLHNQITACDAVLER 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
ELVDGL WPSALVTAI LEAPVTE PRFLEQLQELDAKAAAVREQE 
ARGTAACAD VRGVLDRL R VKAVTKI RE F I LQKI Y S FR KPMTNYQ 
IPQTALLKYRFFYQFLLGNERATAKEIRDEYVETLSKIYLSYYR 
S YLGRLM KVQ YEEVAEKDDLMGVE DTAKKG FFSK PS LRS RNT I F 
TLGTRGS VIS PTELEAP I LVPHTAQRGEQRYPFEALFRSQHYAL 
LDNS CRE YL F I CEF FWSGPAAHDLFHAVMGRTLS MTLKHLD S Y 
LADCYDAIAVFLCIHIVLRFRNIAAKRDVPALDRYWEQVLALLW 
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SEQ 

\ ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PRFE L I LEMNVQS VRSTD PQRLGGLDTRPH Y I TRR YAE FS S ALV 
S INQTI PNERTMQLLGQLQVE VENFVLRVAAE FSSRKEQLVFLI 
NNYDMMLG VLM \E * ERAADDS KEVES FQQLLNARTQEF I EELLS 
P PFGGLVAFVKE AEAL I ERGQAERLRGE E AR VTQL I RG FG S S WK 
SSVESLSQDVMRSFTNFRNGTSIIQGALTQLIQ\LYHRFHRV\li 
SQPQLRAIiPARAELINIHHLMVELKKHKPNF 


5883 


2 


1374 


E FPGRRFRAVME AGAGAGAGAAGWS CPG PGPTVTTLGS YEAS EG ' 

CERKKGQRWGSLERRGMQAMEGEVLLPALYEEEEEEEEEEEEVE 

EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEL 

EETRELAGQHEDDSLBLQGLLEDERLASAQQAEVFTKQIQQLQG 

ELRSLREEISLLEHEKESELKEIEQELHLAQAEIQSLRQAAEDS 

ATEHE SD I AS LQEDL CRMQNELE DM ER IRGDYEME I ASLRAE ME 

MKSSEPSGSLGLSDYSGLQEELQELRERYHFLNEEYRALQESNS 

S LTGQLADLE S ERTQRATE RWLQS QTLS MTS AES QTS EMD FLE P 

DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEBLQHHRQVSEE 

EQRRLQRE LKCAQNE VLRFQTSHS \ S P SHPLPP I P PS S PCLIi * A 

LWISALLWCWWAETSS 


5884 


4261 


2522 


G VLARAS ARLR VP LTGVRACAE PE VGAE PAKVAGAAE PDEDGGR 
SRLRDCGDYTPSERLGPKGAMLWFQGAIPAAIATAKRSGAVFW 
FVAGDDEQSTQMAAS WEDDKVTEASSNS FVAI KIDTKSEACLQF 
SQI YPWCVPSSFFIGDSGI PLEVTAGSV 5 ; AnPT.VTP Tum/criM 

HLLKSETSVANGSQSESSVSTPSASFEPNNTCENSQSRNAELCE 
IPSTSDTJCSDTATGGESAGHATSSQEPSGCSDQRPAEDLNIRVE 
RLTKKLE E RRE E KRKEE EQRE I KKE I ERRKTGKEMLD Y KRKQE E 
ELTKRMLEERNREKAEDRAARER I KQQIALDRAERAARFAKTKE 
EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF 
TNQFP SDAP LEE ARQ FAAQTVGNT YGN FS LATMFPRREFTKED Y 
KKKLLDLELAPSASWLLP/ALFINF*AGRPTASIVHSSSGDIW 
TLLGTVLYPFLAIWRLISNFLFSNPPPTQTSVRVTSSEPPNPAS 
S S K S E KRE P VRKRVL B KRGDDF KKEG K I YRLRTQDDGEDENNTW 
NGNSTQQM 


5885 


900 


467 


AAGGGRRSRLSRSWPTGPSKSPSGVRCCG\RR\AWEDKDEFLDV
IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN

YLQ I DEEEYGGTWELTKEGFMTS FA/ 1 VHGHLDHLLHCHPL* LM 
VYSSQVLPIQSKGPS 


5886 


86 


1341 


PFRGRALTLKKQPRPGVAPPSLGTCHKSDPGRPAAQSQPPSPGS 
GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 
EVLLEALFLTVT)PY^VAAKRLKSGDTMMGQQVAKVVESKNVAL 
PKGTIVLASPGWTTHSISDGKDLEKLLTEWPDTIPLSIiALGTVG 
MPGLTAYFGLLE I CGVKGGETVMVNAAAGAVGS WGQIAKLKGC 
KWGAVGSDEKVAYLQKIiGFDWFNYKTVESLEETLKKASPDGY 
DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
PPEIGIYQELRMEAFWYRWQGDARQKAIiKDLLKWVLELPYFVI 
D* LQANTLVYKSMKSAKPS LE YISEKLVSG\KI QYKE YI I EGFE 
NMPAAFMGMLKGDNLGKT I VKA 


5887 


1937


104 


APGCRG CRATRCP CRG PR WDS LGDEAARS PAAPGGAPGLLGLRE 
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 
PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACSVPWTGDSQFCSQKAVIYSLNFTANPPQRVFELVDQINPSI 
FCIHITN\*NLHYPLIiIQKYL/NENNFDTLMKTSDGFTLNAESY 
VSFTTKLDIPTAAKYEYGVPLQTSDSFLRFPSSLTSSLCTDNNP 
AAFL VNQAVKCTRKINLEQCE E I E ALS MAFYSS PE I LRVPDS R K 
KVP I TVQS I VI QS LN KTLTRREDTD VLQ P TL VNAGHFS LC VNW 
LEVKYSLTYTDAGEVTKADLSFVLGTVSSVVVPLQQKFEIHFLQ 
ENTQPVPLSGNPGYVVGLPLAAGFQPHKGSGIIQTTNRYGQLTI 
LHS TTEQ D CLALEG VRTP VL FGYTMQSG CKLRLTGAL PCQLVAQ 
KVKS LLWGQG F PD YVAP FGNS QG P / ADMLD WVP I HF I TQ S FNRK 
DS CQLPGALVI EVKWTKYGSLLNPQAKI VNVTANLISS SFPEAN 
SGNERTILI S TAVTFVDVSAPAEAG FRAPPAINARLPFNFFFPF 
V 
Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

LLCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG 
LE LHPDYKT WG P EQ VCS FLRRGG FE E P VL LKN I RENE I TG ALLP 
CLDESRFENLGVSSLGERKKLLSYIQRLVQIHVDTMKVINDPIH 
GHI ELHPLLVR 1 1 DTPQ FQRLR Y I KQLGGGYYVFPG AS HNRFEH 
S LGVG YLAG CL VHALG E KQ PE LQ I S ERD VLC VQ I AGLCHDLGHG 
PFSHMFDGRFIPLARPEVKWTHEQGSVMMFEHLINSNGIKPVME 
Q YGL IPEEDICFIKEQIVG PLES P VEDS LWP YKGRPEN KS F LYE 
IVSNKRNGIDVDKWDYFARDCHHLGIQNNFDYKRFIKFARVCEV 
DNELRICARDKEVGNLYDMFHTRNSLHRRAYQHKVGNI IDTMIT 
DAFLKADD Y I E I TGAGGKK YR IS TA IDDMEA YTKLTDNI FL EI L 
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKLKAEDFIVDVINMDYGMQEKNPIDHVS 
F YCKTAPNRAIR I TKNQVSQLLP \E KFAEQ\ LIRVYCKKVDRKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKI PTRLPRRLPKSRV\QLFKDDPM 

LPAACGRPVTARPRQAPEGRSGRPRDLDPYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVIIAGNNDSKAKQWSKIKEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FG I F I L \ DLASMTS I RQ FVQ KFKMKK I P LHVL I NNAGVMMVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLLDTLKESGSPGHSARWTVS 
SATHYVAELITODIjQSSACYSPHAAYAQSKLALVLFTYHLQRLL 
AAEGSHVTANWDPGVVNTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAWTSIYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQQQ 
LWS KSCEMTGVLDVTL 



SEQ 
ID 
NO: 



5888 



375 



5889 



1831 



1322" 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



2302 



-13T 



5890 



200 



5891 



1322 



FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP 
LE WKTRLQS S S VTLY 1 3 E VQ LNTMAGAS VNR WS PGPLHCLKV 
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAI TATNP I WL I KTRLQL * / SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI 
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI 
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQ I P \NTAIMMATYELWYLLNG 



200 



5892 



1764 



FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG



379 



5893 



WLRVCGRLS VNS AVS S RTGG WS AGLTCAMQRLQWLGHLRGPA - 
DS GWM PQ AAP CL S GAPHASAADWWHG RRTAI CRAGRGGFKDT 
TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI 
AQFLSD I PETVPLSTVNRQCSSGLQAVAS IAGGIRNGS YDIGMA 
CGVESMS LADRGNPGNITSRLMEKEKARDCL I PMG I TS ENVAER 
FGISREKQDTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG 
TKRSITVTQDEGIRPSTTMEGLAKLKPAFKKDGSTTAGNSSQVS 
DGAAA I L LARRS KAEELG LP I LGVLRS YAWGVP PD I MG I GPAY 
AI P VALQKAGLTVSDVD I FEINE \AFASQAAYCVEKLRLPP * EG 
* TP LGGASGP *GHPLGLHWGHVQVITLAQ* S * SARGKRAYRSGC 
PCAIGSWNGSPLPVFEYPWGT 



1653 



I LS KRRCQKAKTKE LMAKKVAV IGAGVSGL I S LKCC VDEGL E P T 
CFERTEDIGGVWRFKENVEDGRASIYQSWTNTSKEMSCFSDFP 
M P EDFPNFLHNS KLLE Y FRI FAKKFDLLKY IQFQTT VLS VR KCP 
D FS SSGQ W KWTQSNGKEQ S AVFDAVMVCS GHHI LPH I PL KS F P 
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
TAVKWMI EQQMNRW FNHENYGLE PQNKYI MKEPVLNDDVPSRLL 
CGAI KVKSTVKELTETS AI FEDGTVEENI DV1 1 FATGYSFS FPF 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LEDSLVKVENNMVSLYKYIFPAHLDKSTLACIGLIQPLGSIFPT 
AELQARW VTR VFKGLCS LPS ERTMMMD 1 1 KRNEKR I DLFGE S QS 
QTLQTNYVDYLDELALEIGAKPDFCSLLFKDPKIAVRLYFGPCN 
SY*YRLVGPGQWEGARNAIFTQKQRILKPLKTRALKDSSNFSVS 
FLLKILGLLAWVAFF\CQLQWS 


5894 


174 


1673 


RYSPKKVLQNKESSLKLGMATALVSAHSLAPLNLKKEGLRWRE
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREALS
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKELQARVQEH
HPESREDWWLEDLQLDLGETGQQVDPDQPKKQKILVEEMAPL
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVBS
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS*ERKVIQC\HGV
LGKAFQRSSHLVRHQKIHLGEKPYQOTECGKVFSQNAGLLEHLR
IHTGEKPYLCIHCGKNFRRSSHLNRHQRIHSQEEPCECKECXSKT
FSQALLLTHHQRIHSHSKSHQCNECGKAFSLTSDLIRHHRIHTG
EKPFKCNICQKAFRLNSHLAQHYRIHNEEKPYQCSECGEAFRQR
SGLFQHQRYHHKDKLA


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL

PGS DWT PNAO F I TP FFG FREW PS KP RWD * TP TIT . K \ WfiN DfTPD * P 

gfedk\ vfyvwfdat IG YLS ITANYTDQWERWW \ KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQ YI \QVNEPW\KR I KGS E ADRQRAGTVTGLAVNI AALLS VML 
QPYMPTVSATIQAQLQLP PPACS I LLTNFLCTLPAGHQIGTVSP 
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA 
LMDE VTKQGN 1 VRELKAQ KADKNE VAAE VAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5896 


2967 


86 


HPS LLGAI PF YP PPSSPW P P PLYLFWNSHRKSRHF INQRG IHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATEIiQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ \ E P CQR\ AARRL VLKQ\ QG VLALR \PYLQKQPQPSPA 
SGKGL S PIEPEEEELATLS EEE I AMA VTAWEKGLES LPPLR PQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDE YGTATETKAL \EEGLTPQE I CDKYH I I HA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVEL KKPQC KVCRS CP WQS S QHL FLDLPKLE KRL E EWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
G FEDK \ VFYVW FD ATIG YLS I TANYTDQ WERW W \ KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\ LLEKVR I RDALRS I LT I S \ RH 
GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML 
Q P YMPTVS AT I QAQLQLP P P ACS I LLTNF LCTLPAGHQ I GT VS P 
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVIiALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTL YL CGTDE YGTATETKAL\ EBGLT PQE I CD K YHI I HA 
D I Y \ RWFN I S FD I FGRTTTPQQ \ TKI T \ QD I FQQLLKRGFVLQD 
TVEQLRC EHCAR F \ LADRFVBG VCP FCG YE EARGDQ CDKCGKL I 
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL , 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
G FEDK\ VF Y VWFDAT IGYLS I TANYTDQWER WW \KNP EQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K \ FS KSRGVG VFRDM \ AHDTG I P PD I SRF YL \ LY I R P EGK\DS A 
FS WTDLLLKNNS \ ELLNNLGN F I NRA\GMFVS KF FGG \ YVPEMV 
LT PDDQRLLA\HVTLELQHYHQ \ LLE KVRI RDALiRS I LT I S \ RH 
GNQYI \QVNE PW \ KRI KGSEADRQRAGTVTGLAVN IAALLSVML 
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP 
LFQKLENDQI ESLRQRFGGGQAKTS PKPAWETVTTAKPQQIQA 
LMDE VT KQGN I VR E LKAQKAD KNE VAABVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
E ATE LQ PTLS AAL YYL \ WQG KKG \ ED VLG S VRRTLTH I DHS LS 
RQ\NCPFIiAGETESLADIVLWGALYPLLQDPAYLPEELSALiHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDI FQQLLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\ VFYVW FDAT I G YLS I TAN YTDQ WERWW \ KNP EQVDLYQ 
FM\ AKDNVP FHS LVFPS S ALGAEDN YTL \ VSHL I ATE YLN YEDG 
K\ FS KSRGVG VFRDM \ AHDTG I PPD I SRFYL\ L YIRPEGK\DSA 
FS WTDLLLKNNS \E LLNNLGN F I NRA\GMFVS KFFGG \ YVP EMV 
LTPDDQRLLA\HVTLELQH YHQ \ LLE KVR I RDALRS I LT I S \RH 
GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML 
Q P YMPTVS AT I QAQLQL P P PACS I LLTNFL CTL P AGHQ I GT VS P 
LFQKLENDQI ESLRQRFGGGQAKTS PKPAWETVTTAKPQQIQA 
LMDE VTKQGN I VRELKAQKAD KNE VAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5899 


326 


1078 


NCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEEIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKI LMS TMRNQ ARLKVLRARNDL I S DLLS EAKLRLS R I VED P 
EVYQGLLDKLVLQGLLRLLEPVMIVRCRP\QDLLLVEAAVQKAI 
P E YMT I S QKHVE V\ Q I DKEA* LAVECS WE VWE VYSGNQR I KVSN 
TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


64 


1409 


KAASRDSPCLEFCPLCGVSSHDLQHRMWYHRLSHLHSRLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL 
KYANTVMRFDYVWLRDHCRSASCYNSKTHQRSLDTASVDLCIKP 
KTIRLDETTLFFTWPDGHVTKYDLNWLVKNSYEGQKQKVIQPRI 
LWNAE I YQQAQVPSVDCQS FLETNEGLKKFLQNFLLYGI AFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTTYFQEPCGIQVFHCLKHEGTGGRTLLVDGFYAAEQVL 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








QKAPEE FELLS KSAI \KHEYIEDVGECHQPHDWDWAQS* ISTHCT" 
/ YKELYLI RYNN YDRAVINTVP YDWHRWYTAHRTLTI ELRRPE 
NEFWVKLKPGRVLFIDNWRVLHGRECFTGYRQLCGCYLTRDDVL 
NTARLLGLQA 


5901 


1 


2121 


VAIEQTSLKMMQAVGGAPARPTGEYICNQCGAKYTSLDSFQTHL 
KTHLDTVLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE 
S CDKQFT S VDDLQKHLLDMHT FVF FRCTLCQ E VFDS KVS IQLHL 
\AVKHSNEKKVYRCTSC^DFRNETDLQLHVKHNHLEN^GKVHK 
C I FCGE S FGTEVELQCH I TTHS KKYN C KFCS KAFHAI I LLE KHL 
REKHCVFETKTPNCGTNGASEQVQKEEVELQTLLTNSQSSHNSH 
DGS EED VDTS EPM YGCD I CGAAYTMETLLONHOLRDHNT T Rpnpq 

AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHLGPVKHYM 
CPICGERFPSLLTLTEHKVTHSKSLDTGNCRICKMPIiQSEEEFL 
EHCQMHPDLRNSLTGFRCWCMQTVTSTLELKIHGTFHMQKTGN 
GS AVQTTGRGQHVQKLYKCAS CLKEFRS KQDLVKLDINGLP YGL 
CAGCVNLS KSAS PG INVP PGTNRPGLGQNENLS AI EGKGKVGGL 
KTRCS*LATFKF+VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
QVSPMPRISPSQSDEKKTYQCIKCQMVFYNEWDIQVHVANHMID 
EGLNHE CKL CS QT FDS PAKLQCHL I EHS FEGMGGTFKCP VCFTV 
FVQANKLQQHIFSAHGQEDKIYDCTQCPQKFFFQTELQNHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPSIRQSIGSTSVSRWLTSLFTYLDHTADVQ*V*REF " 
I PLKPRQ * ED * MFQS WLHAWGDTLEEAFEQCAMAMFG YMTDTGT 
VE PLQTVEVETQGDDLQS LLFHFLDEWL YKFS ADE FF I P \GWGE 
EFSLSKHPQGTEVKAITYSAMQVYNEENPEVFVIIDI 


5903 


2106 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT 
PALFALSAVPGGAASPMPPSGLRLLPLLLPLLWLLVLTPGRPAA 
GLSTCKTIDMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGP 
LPEAVLALYNS TRDR VAGES AEP E PE PEADYYAKEVTRVLM VET 
HNEIYDKFKQSTHSIYMFFNTSELREAVPEPVLLSRAELRLLRL 
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGW 
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATIHGMNRPFLLLMATPLERAQHLQS\SRHRQAL\DTNY\CFSF 
HGGRNCLRC/VHC*HLIFRKDL\GW\KWI\HE\PKGYHANFC\L 
GPCP YI WSLDTQYSKVLALYNQ\HKPG\ASAAP \ CCVPQALEP\ 
LPIVYY\VGRKPKVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAINTFKEEQRLIYEELIKEEKTTNWELSAISRKIDTW 
ALGNSETEKAFRAI S S KVPVDKVTPS TLPEEVLDFEKFLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKESIQIWKTKKQQKREEIFKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKSIEMSMKCASQL 
KEEEEKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKEI 
REKAE KAEKRKNAADE I S RFQERDLH KLELKI LDRQAKBD E KS Q 
KQRRLAKLKEKVENNVSRDPSRLY/NTHQRLGRTNQKDRTNRLW 
ATS T YP T* G YSNL ETRNTE KSMR 


5905 


287 


2912 


MASFPPRVNEKEIVRLRTiGELLAPAAPFDKKCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNS DGGQKNKPREH I ID CGD I VWSLAFGSSV P EKQ S RCVN I EWH 
RFRFGQDQLLLATGLNSGRI KI WDVYTGKLLLNLVDHTGWRDL 
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNWVY\ 
SCAFSPDSSMLCSVGASKAWAAILV*LRLCWHHSHTGATMVLS 
WAE RVAS LATG LG ATFT I G * S NLAF VLQGVL YVHR CWSMS T FCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI 
WLSNGFSVLFFGILSDSRDILRL*FNLKFVLIFF*K*CIVSVQK 
KKKPKRIALLQEERLS*DKPPSSHIiI*QTEVNIRILFRAILHS* 
LLIFRI *NCI *TYS * I IDPFYIQMTYDRG*FGKNKMVKF* FIEM 
*LYYFHKIAFSFCNW*HPCCLPKKFHLAVNILFACSICFSS*A 
QVGDPSLIi+TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL* YLTLFI S VYFS * LVFGINGFQ YS FWKLHCLYFMFRLI 
FICLTFNRN1*NRICMSALINLKTDFNLTMTLSIFFKLLIIYNA* 
YNLN*I*QF*YKMCHFVLCMSE*SYNICLFIAGF\LWNMDKYTM 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








IRKLEGHHHDWACDFSPDGALLATASYDTRVYIWDPHNGDILM
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGIIHVASLADDKMVR

FWRIDEDYPVQVAPLSNGI^CAFSTDGSVLAAGTKDGSVYFWAT 
P RQVPSLQHLCRMS I RRVMPTQEVQELP I PS KLLEFLS YRI 


5906 


146 


2038 


REGAGSGRMASGA\YNPYIEIIEQPRQRGMRFRYKCEGRSAGSI 
PGEHSTDNNRTYPS I Q I MNY YGKGKV\R ITLVTK\NDP YKPHPH 
DL VG KD CRD \G YYEAE FGQE \ RRP \ LFFQN \LG I RCVKKKE VKE 
A\IITR\IKAGINPFDVP*KQLNDIEDCDLDWRLWFRVFLPDG 
HGNL\ TTALPPV\ VS SP I YDNRAPNTAELR VCR VNKNCGSVRGG 
DE I FL LCDKVQKDD I EVR F VLNDWEAKG I FSQADVHRQVAI VFK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAGITVNFPERPRPGLLGSIGEGRYFKKEPNLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLS S FS TRTL PS NS QG I P P FLRI P VGNDLNASNAC I YNN 
ADD I VGMEAS S M PSADL YG I SD PNM LS NCS VNMMTTS S D S MGE T 
DNPRLLSMNLE N PSCNS VLDPRDL RQLHQMS S S SMS AGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGSMQNEQLSDSFPYEFFQV 


5907 


99 


1873 


TYLLSSWSS**NLDTKIKSQVKV/RKGHKi<l6WPYPQPAKQNGK 
KATSKVPSAPHFVHPNDHANREAELKKKWVEEMREKQQAAREQE 
RQKRRT I E S YCQD VLRRQEE FEHKE E VLQE LNM FPQLDD EATRK 
AYYKE FRKWE YS DV I LEVLDARD P LG CRC FQME EAVLRAQGNK 
KLVLVIiNKIDLVPKEWEKWLDYLRNELPTVAFKASTQHQVKNL 
NRCS VPVDQASES LLKS KACFGAENLMRVLGNYCRLGEVRTHIR 
VG WGL PNVG KS S L IN S LKRS RACS VGAVPG I TKFMQE V YLD KF 
IRLIiDAPG I VPG PNS E VGT I LRN CVHVQKLAD P VTP VE T I LQRC 
NLEE I SNYYGVSGFQTTEHFLTAVAHRLGKKKKGGL YSQEQAAK 
AVLADWVSGK I S FYI PP PATHTLPTHLS AE I VKEMTEVFD I EDT 
EQANEDTMECLATGE SDE LLGDTDP LE ME I KLLHS PMTKI ADAI 
ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
S VDRRS VLQR I METD PLQQGQALAS AL KNKKKMQ KRADK IAS KL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCGIKKRGEGSGSPSPASGGFQLGCQIPEPSLPSEEETHPHTRA 
HTRTLRATLTRRP P RSHS TRLR FPM PLDGDGGLAS W K/ PMRER * 
G WRR PAKAAGAS LGVAATG KRGCRMS KRYLQKATKGKLLI 1 1 F I 
VTL WGKWSSANHHKAHHVKTGTCEWALHRCCNKNKI EERSQT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKG WSCS S GNKVKTTRVTH 


5909 


1 


5002 


PAIPGSTIIWAPGSHSAARADGRHGSIiPSQSQAPGALCGARAPP 
SSNLRADRSMIC^OJ\RAGKNLYHNRFLGIiAAMAFPSRNSQSLRR 
CKEP I R YS YNPDQFHNMDLRGGPHDGVT I PRSTSDTDLVTS DSR 
STLMGRSSYYSIGHSQDLVIHWDIKEBVDAGDWIGMYLIDEVLS 
ENFLDYKNRGVNGSHRGQI I WKIDASS YFVEPETKI CFKYYHGV 
SGALRATTPSVTVKNSAAPIFKS IGADETVQGQGSRRLISFSLS 
DFQAMGLKKGMFFNPDP YLKI S IQPGKHS I F PAL PHHGQERRS K 
I IGNTVNP I WQAEQFS FVSLPTDVLE I EVKDKFAKSRP I IKRFL 
GKLSMP VQRLLERHAI GDR WS YTLGRRL PTDHVSGQLQ FRFE I 
TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGEEASALLLE 
uKacu\fi\o i jMir.fijKbbAl lybKAvjKhbKhKJLQEEEGDVSTLEQG 
EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPBTPRTHYIR 
IHTLLHSMPSAQGGSAAEEEDGAEEESTLKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
S S C YS AS C YS P S C YNGNRFAS HTR FS S VDS AK I S ES TVFS SQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 
SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 
IDE PLP PNW EAR IDS HGR VFYVDHVNRTTTW QR PTAAAT PDGMR 
RSGSIQQMEQLNRRYQNIQRTIATERSEEDSGSQSCEQAPAGGG 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








GGGGSDSEAES SQSSLDLRREGSLS PVNSQKI TLLLQS PAVKFI 
TNP E F FT VLHANYSAYR VFTSS TCLKHMILKVRRDARNFER YQH 
NRDLVNFINMFADTRLELPRGWEI KTDQQGKS FFVDHNS RATT F 
ID PR I PLQNGRLPNHLTHRQHLQRLR S YS AGEAS E VSRNRGASL 
LAR PGHS L VAAI RSQHQHE S LPLA YNDKI VAFLRQPN I FEMLQE 

DHDC T AT? MUTT D C VTLIV TT3 r PU i r >, 7vTLI/~«T pvt crin« nT tttt t nr nnn 

KUtoijAKJNn iJjKfciKXHi IKibCjWnGJjEKLiSCDADIiVTIiLSIjFEE 
E I MS YVPLQAAFHPG YS FS P RCS PCS S PQNS PG LQRAS ARAP S P 
YRRD FEAKLRN F YRKLE AKG FGQG PGK I KLI I RRDHLLEGT FNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSREFFFL.LSQELFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\LALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDEEFHQSLQW 
MKDNNITDILDLTFTVNEEVFGQVTERELKSGGANTQVTEKNKK 
EYIERMVKWRVERGWQQTEALVRGFYEWDSRLVSVFDARELE 
LV IAGTAE I DLNDWRNNTE YRGGYHDGHL VIRW FWAAVE RFNNE 

LPPRG\HTCLQPDWDLPTVSPRTPMLYEK\LLTA\VEETSTFGT 


5910 


1526 


446


VAE FAAME PGRTQI KLDPRYTADLLEVLKTNYG I PSACFSQPPT 
AAQLLRALG P VE LALTS I LTLLALGS I AI FLEDAVYLYKNTLCP 

x rvrtn. i uunixjcftr x v v D vutL" V?JjWX IrKo xjVJj VUifll liar XAVL. 

F YLLMLVM VEG FGG KEAVLRTLRDTPMMVHTGP CCCCCP CC PRL 
LLTRKKLQ \ R* CWALSNTPS * R * R* PW WACFSS PTASMTQQTFL 
RGAQLYGSTLSSA/ CSTLLALWTLGI ISRQARLHLGEQNMGAKF 
ALFQVLLILTALQPSIFSVLANGGQIACSPPYSSKTRSQVMNCH 

LtilLETFLMTVLTRMYVTJPKnHTTVnVRTFQQianT.'nT.Krr.vaT pwm 
ijud.ua i r urix vjui rti'i x X lvin. ivLJi 1 in. V \j X Ij x s oor 1/1jJJ1j^JjJSxuj.KVVI v 1 

AWTMKGCCTH | 


5911 


109 


595 


QLPLAPCIQGKGLEMRSPKPQSFIIRSSHSGAGLLVKNPSTPVF 
CGHRRGG AAFKYKP TP VVG PEQR PTG QKHMRGGV S LLS PRLE CS 
GT I S AHCNLRLP S S SNS PAPAS * LAG I TG VCHHAQL I F VFL VE T 
GFHHVGQAGIiELL/NWIHLPRPPKVLGLQA 


5912 


924 


277 


MILNKALMLGALALTTVMSPCGGEDIVADHVASYGVNLYQSYGP 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
IAVLKHNIiNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 

r , T < VTTMTPDPTA7MTTMT.C\TriTJQVTT< , /^l\/CT?T , DOCCT3T/ , Gr\Tjn , r t r\T\r\ 

VTSPSFPFE* * DL * TAKVEQLGAW FE PLLKHWGAB I PTTL 


5913 


45 


1198 


QLRMAGAEGAAGRQS ELEPWS LVDVLE EDEELENEACAVLGGS 

DS EKCS YS QGS VKRQAliYACS TCTPEGEE PAG I CLACS YECHGS 

HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAfCVNSGNKYNDN 

FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 

SGDFQEMVCQACMKRCS FLWAYAAQLAVT KI S T \GMMD WCGTLM 

P* /nnnPVT VDT?MfiPUOnCTT.VIi'nuDT?nr v nmrotrTrB-ircvMaocn 
u f \JU\£iu v x ixtr E>L\\3nj£i.\ w ru& 1 LilXlLU V tr CjU^jaX'IjVKxI VKVriU^'ofcjP 

CAGSSSESDLQTVFKNESLNAESKSGCKLQELKAKQLIKKDTAT 
YWPLNWRSKLCTCQDCMKMYGDLDVLFLTDEYDTVIAYENKGKI 
AQATDRSDPLMDTLSSMNRVQQVELI C/G IQ * FED 


5914 


960 


124 


NLGGS ELP PEEALF I Q VASMNQRR VDFYLAS I EDMLVAI /GGRN 
ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 

nwriYflTf3PVT?TfMT .T ^vnWT?Tn\7TrfPT?PT?DM'TTRBr , tJTJCMr , OT _r»Tkc 
\jnu L\^±\jtr x kj\INXjXj\_ i unKiUv nCtCtt\t\iri*l 1 l/iivVjr»WoM\»oLftj]Jo 

IYSIGGSDDNIESMERFDVLGVEAYSPQCNQWTRVAPLLHANSE 
S GVAVWEGR I Y I LGGYS WENTAFS KT VQ VYDREADKWSRG VDL P 
KAI AGRCJ AC FT AP * ^T.flOPTR K R KAKARnTR TfZA cnDC r*2v cwnu 
PHRHLPGLCRPAATS 


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARIIQAPHCHSPRPRTCPPGALQAPEA 
PASRAEGPVAVWNGHTEGPAPARSAPiOSPPGLPRPLGSFPCPT 
PQEDFPALGGPCPPRMPPSPGFSAVVLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQ E LPGPAGGE F PEGL * +AAGPAAH 


5916


256 


633 


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
TVTGAVHRHLNHVAG I I P WVLHS QLKPTAATAQDQWTS QQ YPDH 
PTRLILQ * NQATADKNN* TTALLOPHQRL\VS PRMAEA 


5917 


1343 


827 


AHQILTYLEP/ ICLWNYNKILTVFLTKS VLEI * KFIHTPQTYR 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








F*NDFFGIKEVYVSRRLRKTSF/ RLAVTFLEQAWS KECVPVDQ " 
FMEHLL P S LLS LAS D P VPNVRVLLAKALRQMLLE KAY FRNAGNP 
HLEV I E ET I LALQS DRDQDVS F FAALE P KRRNI I DTAVL E KQN 


5918 


i 7 

lc3 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGRRRMETP FYGDEALSGLGGGAS GSGGTFAS PGRLFPG 
AP PTAAAGS MMKKDALTLSLS E Q VAAALKPAPAP AS Y P PA\ ADG 
APSAAPPDGLLAS PDLGLLKLASPELERLI IQSNGLVTTTPTSS 
QFLYPKVAASEEQEFAEGFVKALEDLEKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAP PG E LAPAAAAP EAP VYA\ NL9 S Y \ AGGCRGL 
RGGAAT\VAFAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALERESEDPS*SPEHGSLASTASLLREQVAQLK 
QKVLSHVNSGCQLLPQHQVPAY 


5919 


1 


4254 


TSVQGDSQGTPTSSQGS INMEHWISQAIHGSTTSTTSSSSTQSG 
G S GAAHRLAD VMAQTH I ENHSAPPDVTTYTSEHS IQVERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 
P P S LEAALQRWGT ISP KAPCLTTMDTNGKP L Y I LT YG KLWTRS M 
KVAYS I LHKLGTKQEPMVRPGDRVAL VFPNNDPAAFMAAFYGCL 
LAE WPVP I E VPLTRKDAGSQQ IG FLLGS CGVTVALTSDACHKG 
LPKSPTGEIPQFKGWPKLLWFVTESKHLSKPPRDWF\PHIKDAN 
NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQALTQACGYTEAE 
T I VNVLD FKKD VGL WHG I LTS VMNMMHVI S I P YS LMKVN P LS W I 
QKVCQYKAKVACVKS RDMHWALVAHRDQRD I NLS SLRML I VADG 
ANPWSISS CDAFLNVFQS KGLRQE VI CPCAS S PE ALTVAIRRPT 
DDSNQPPGRGVLSMHGLTYGVIRVDSEEKLSVLTVQDVGLVMPG 
AI MCS VKPDG VPQLCRTDE IGELCVCAVATGTS Y YGLSGMTKNT 
FEVFAMTSSGAPISEYPFIRTGLLGFVGPGGLVFVVGKMDGLMV 
VSGRRHNADDI VATALAVEPMKFVYRGRI AVFS VTVLHDER IVI 
VAEQRPDSTEEDS FQWMSRVLQAI DS I HQVG VYCLAL VPANTLP 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PEIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARKFLFLSE 
VLQWRAQTTPDH I LYTLLNCRGAI ANS LTCVQLHKRAE KIAVML 
MBRGHLQDGDHVALVYP PG I Dh I AAFYGCLYAGCVP ITVRPPHP 
QNI ATTLPT VKM I VEVSRSACLMTTQL I CKLLRSREAAAAVDVR 
TWPLILDTDD * PKKRPAQI CKPCNPDTLAYLDFS VSTTGMLAGV 
KMSHAATSAFCRS I KLQCELYPSREVA I CLDPYCGLGFVLWCLC 
S VYSGHQS IL I P PS ELETN P ALWLLAVS QYKVRDT FCS YS VM EL 
CTKGLGSQTESLKARGLDLSRVRTCVWAEERPR I ALTQS FSKL 
FKDLGLHPRAVSTSFGCRVNIAICLO^TSGPDPTTVYVDMRALR 
HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 
LGEIWVHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 
TGYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID 
IETSVIRAHKSVTECAVFTWTNLLWWELDGSEQEALDLVPLV 
TNWLEEHYLIVGVWWDIGVIPINSRGEKQRMHLRDGFLADQ 
LDPIYVAYNM 


5920 


1381 


1499 


QLGAVAHAGVSRIPP*LFPPLHPTFLSLWCLHHKLP/HPPGASM 
VRPPWPRRPPAH I SS VRQASTQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCSSAAHSTYRVQEPAVHIPGQEPLTASM 
LAAAPLHEQKQMIGERLYPL I HDVHTQLAGKITGMLLE IDNS EL 
LLMLES PES LHAK I DEAVAVLQ AHQAMEQ P KAYMH 


5921 


727 


157 


VCPGTGGE*GLWGQLGGLPKETPLKPMDAFTGSGLKRKFDDVDV" 
GSSVSNSDDEISSSDSADSCDSLNPPTTASFTPTSILKRQKQLR 
RKNVR FDQVTVYYFARRQGFTS VPSQGGSSLGMAQRHNS VRS YT 
LCEFAQEQEVNHREILREHLKEEKLHAKKMKLTKNGTVESVEAD 
GLTLDDVSDEDIDVENVEVDDYFFLQPLPTKRRRALLRASGVHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKCQVD 
RMS FPCGCS RDGCGNMAGR I E FNP I R VRTH YLHT I MKLELES KR 
Q\GAAQQPQ\*GALPDCQLQPDRSTGL*DPSWIGSKGLSFTGKG 
AAATHLI ILRVIENRGAEGKRK 


5922 


2475 


495


SYSNWGLFPSVFIQVPRSRTGNLKPIFLFYSYYE\CMETLKG\T
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








CLYNATQYKVCSPRNDRPDACYNPSEPAATTVFEIRTGLLLGDT - 
SKIITRTEEKEIPKQITLRFDACAAINSKKLEIGCGSLN*ERS+ 
RVENKYVCHESGVCKNCAYWPCVI *AT*KKNKNDSVYLQKGEAN 
PS CAAGHCNPLELI I TNPLDPHWKKGERVTLGINRTGLKPQVVI 
L I KGEVHKCS P KP VFQTF YEELNLPAPELLKKTKNL FLQLAENV 
IFLLNGTSCYVRGGTTIGDRWPWEA*ELVPTDPAPDI IPI * KAE 
ASNF * VLKTS 1 1 RQYCI AREGKD F 1 1 P VGKPNC IGQ KL YNSTTK 
TIT** DLNHTE KN P F S KFS KLKTA* AHAE S H * DWT VPS G L Y * IC 
RHRAYFRLPNKWADSCVIGTI KPS FFLLP I KMGELLGFS VYASR 
E KKG I V IGNWKDNEW P RER 1 1 Q Y YG P ATWAQDG S WG YR /TP/ VY 
MLNWI IRLQAILEI ISNETGRALTVLAWQETQMRNAI YQNRLAL 
DYLLVAEGGVCRKFNLTNCCLQINDQGQWKNIVRDMTKLAHVP 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 
PLLFQMIKGIVATLVHQKTSAHVNYMNHYRSISQRDSKSEDESE 
NSH 


5923 


137 


638 


QLCGRRGQRFRTS I KRMHPI * RTCPNTNL/ I ILtSQENTQ IRDL 
QQENRELWIS LEEHQDALEL I MS KYRKQMLQLM VAKKAVDAE P V 
LKAHQSHSAE I ESQI DR I CEMGEVMRKAVQVDDDQFCKI QEKLA 
QLELENKELRELLSISSESLQARKENSMDTASQAIK 


5924 


274 


2146 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLNSLTPPTSVRRM "" 
PL I TTVTLLKM VARHHM KLLCS KAFS TQLQQ KI FLHSQMG I HHQ 
SVCMKLKPNTSHI ISILMGQPMALVQLETLAPLTI I IQKFQTQD 
HMKFWKNLPLHSHHLTPSVPQTVIPKKTGSPE I KLKITKTIQNG 
RELFES S LCX3DLLWE VQAS E \Q * NQS I ESRKEKRKKSNKHDS SR 
SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVLKEGSPSS 
ANT I FCSNNGSVHW \ FKFQVGDLVWS KVGTYPWWPCMVSSDPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REER I EQ YTF I Y I D KQPEE ALSQAKKS VAS KTE VKKTRRPRS VL 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKSLPAS I TMHKGSLDLQKCNMS P WKIEQVFALQNATG 
DGKFI DQFVYSTKG I GNKTEI S VRGQDRLI I STPNQRNEKPTQS 
VSSPEATSGSTGSVEKKQQRRSIRTRSESEKSTEWPKKKIKKE 
QVGFLHVES 


5925 


21* 


1911 


MMTAESREATGLSPQAAQEKDGIVIVKVEEEDEEDHMWGQDSTL " 
QDTPPPDPEXFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQILELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 
DLE LDLS GQQVPGQVHG P EMLARGM VP LDP VQE S S S FDLHHEAT 
QSHFKHSSRKPRLLQSRALPAAHIPAPPHEGSPRDQAMASALFT 
ADS QAM V K I EDMAVS L I L E E WGCQNLARRNLS RDNRQEN YG S AF 
PQGGENRNENEESTSKAETSEDSASRGETTGRSQKBFGEKRDQE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK 
I IHTGEKP YECSECGKAF\SLNS \NLVLHQRI \HTGEKPHECNE 
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
R IHTGEKP YECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 
\ AFTRSS TLTLHHR I HARERAS E YS PAS LDAFGAFLKS C V 


5926 


2 


233 


DRCLMLKQGS Q PGS P PAT/ CE P PAP P VYQAPCQS CPE P PGAHE P " 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146 


1248 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRR LE FI EKE KKQKDQ IIS LMKAEQMKRQEKERLER I NRAREQG 
WRNVLSAGGSGEVKAPFLGSGGT I APSS FS SRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKRE I YGRGLP ERQKGQLAVE RAKQVE E FLQR 
KREAMQNKARAEGHMG I LQNLAAM YGGRP SS S RGGKPRNKEEEV 
YLARLRQI RLQNFNE RQQ I KAKLRGE KKEANHS EGQEGS E EADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS 
SLTDTRETSEEMQKTNNAISS KRE I LRRLNENLKAQEDE KGKQN 
LSDTFEINVHEDAKEHEKEKSVSSDRKKWEAGGQLVIPLDELTL 
DTS FSTTERHTVGEVI KLGPNGS PRRAWGKS PTDSVLKI LGEAE 



WO 01/53312

PCT/US00/34263



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion,
\=possible nucleotide insertion)
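
The symbol key above can be read as a small annotation scheme for each amino acid segment. The following is a minimal sketch, assuming each segment is plain text in which whitespace is not significant; the function name `parse_segment` and the event labels are hypothetical and are not taken from the specification. It separates the one-letter residues from the `*`, `/` and `\` markers and records where each marker falls.

```python
# Symbols follow the table legend; every name in this sketch is hypothetical
# and is not part of the original specification.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MARKERS = {
    "*": "stop",        # stop codon
    "/": "deletion",    # possible nucleotide deletion
    "\\": "insertion",  # possible nucleotide insertion
}


def parse_segment(text):
    """Return (residues, events) for one annotated amino acid segment.

    residues: the one-letter codes with whitespace and markers removed
              (X, the unknown residue, is kept in place).
    events:   list of (kind, index) pairs, where index is the number of
              residues that precede the marker or unknown residue.
    Raises ValueError for any character the legend does not define.
    """
    residues = []
    events = []
    for ch in text:
        if ch.isspace():
            continue  # line breaks and stray spaces are assumed to carry no meaning
        if ch in AMINO_ACIDS:
            residues.append(ch)
        elif ch == "X":
            events.append(("unknown", len(residues)))
            residues.append(ch)
        elif ch in MARKERS:
            events.append((MARKERS[ch], len(residues)))
        else:
            raise ValueError(f"character not in legend: {ch!r}")
    return "".join(residues), events


if __name__ == "__main__":
    # A line from the table that carries a possible-deletion marker.
    seq, events = parse_segment("PHPP/PVVHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY")
    print(seq)
    print(events)
```

Rejecting undefined characters, rather than silently dropping them, is a deliberate choice in this sketch: it makes transcription damage in a segment visible instead of shortening the sequence unnoticed.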








LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKVVHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 
L FDANNP KM L RTCS LPDLS KLFRTLMDVPTVGD VRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEKIKAIHE 
DEDENI E I CS KI VQNI LGNEHQHLYAKI LHLVMADGAYQEDNDE 


5928 


4146 


1248 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQI ISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKRE I YGRGL PE R QKGQLAVERAKQVE E FLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV 
YLARLRQI RLQNFNERQQ I KAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS 
SLTDTRETSEEMQKTNNAI SSKRE I LRRLNENLKAQEDEKGKQN 
LSDTFE I NVH EDAKEHE KE KS VSS DRKKWEAGGQLVI PLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE
LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE\ SLPCT ITDVW I SEE KETKETQS ADR I TIQENEVS EDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKVVHSE
HLNLVPQVQS VQCS PEES FAFRSHSHLP PKNKNKNSLL IGLSTG 
L FDANNP KMLRTCS LPDLS KLFRTLMDVPTVGD VRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEKIKAIHE
DEDENIEICSKIVQNILGNEHQHLYAKILHLVMADGAYQEDNDE


5929 


3 


1558 


LDFSMTTQLPAYVA I LLF YVS RAS CQDT FTAAVYEHAAI LPNAT 
LTPVSREEALALMNRNLDILEGAITSAADQGAHI IVTPEDAIYG 
WNFNRDSLYPYLEDIPDPEVNWIPCNNRNRFGQTPVQERLSCL\ 
AKNNS I YWANI GDKKPCDTSDPQCPPDGRYQYNTDWF\DSQG 
KLVARYHKQNLFMGENQFNVP KE PE I VTFNTTFGS FGI FTCFDI 
LFHDPAVTLVKDFHVDTIVFPTAWMNVLPHLSAVEFHSAWAMGM 
RVNFLASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 
QLDSHP S HS AWN WTS YASS I EALS SGNKE FKGTVF FDE FT FVK 
LTGVAGNYTVCQKDLCCHLS YKMSENI PNEVYALGAFDGLHTVE 
GRYYLQICTLLKCKTTNLNTCGDSAETASTRFEMFSLSGTFGTQ 
YVF PEVL LS ENQLAPG E FQ VSTDGRL FS LKPTSGP VLTVTL PGR 
LYEKDWASNASSGLTAQARIIMLIVIAPIVCSLSW


5930 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 
KLVW IPSE RHGFEAAS I KEERGDEVMVE LAENGKKAMVNKDD I Q 
KMNPPKFSKVEDMAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM
LQDREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKFIRINFDV 
TG YI VGANI ETYLLEKSRAVRQAKDERTFHI FYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKVVSSVLQFGNISFKKERNTDQASMPENTVAQKL
CHLLGMNVMEFTRAILTPRIKVGRDYVQKAQTKEQADFAVEALA 
KAT YERLFRWLVHR INKALDRTKRQGAS F IG I LDIAGFE I FELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ
GSHS KFQKPRQLKDKADFCI IHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQS SDRFVAELWKDVDR I VGLDQ VTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVKCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 
NAIPKGFMDGKQACERMIRALELDPNLYRIGQSKIFFRAGVLAH
LEEERDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSALKVLQR
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM
RARLAAKKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ
DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF
IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE
RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL
AKKEEELQGALARGDDETLHKNNALKVVRELQAQIAELQEDFES
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV
QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ
EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA
KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD
HQRQVASNLEKKQVKKFDQLLAEEKSISARYAEERDRAEAEARE
KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ
RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE 
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5931 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 
KLVWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ
KMNPP KFS KVEDMAE LTCLNE AS VLHNLKDR Y YS G L I YT YS GLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I LCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKFIRINFDV
TG Y I VGAN I ETYLLEKSRAVRQAKDERTFHI FYQLLSG \ AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKVVSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHLLGMNVME FTRA I LT PR I KVGRD YVQ KAQ TKEQAD FAVEALA 
KAT YERLFRWL VHRI NKALDRTKRQGAS F I G I LD I AGFE I FELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
LD PHLVLDQLRCNGVLEG I RI CRQGF PNR I VFQE FRQR YE I LTP 
NAI P KGFMDG KQACE RM I RALELDPNL YR IGQS KI FFRAGVLAH 
LEEERDLKITDI IIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEE ELQAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM
RARLAAKKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ
DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF
IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE
RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL
AKKEEELQGALARGDDETLHKNNALKVVRELQAQIAELQEDFES
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV
QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ
EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA
KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 
KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ
RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


572 


RHLEE I CFL F LQ KGRKLKLSGPR W E EG KP RGTGGLW VKAEANMG 
FGATLAVGLTI FVLS WTI I ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PVVHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQ YP P P Y PAQ P MGP PAYHETLAGGAAAP Y PAS QPP YNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKT PGGS QKAS S KT R S S D VHS S G S S DAHMDASG PSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGVVNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVP I PPHP I YI PPSMMEHTLPP P PSGLP FNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKVVIPTERNLLALI
HRM I E FWREGPMFEAMI MNREINNPMFRFLFENQTPAHVYYRW 
KLYS I LQGDS PTKWRTEDFRMFKNGS FWRPPPLNPYLHGMS E EQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEE I VDC I TES LSI LKTPL P KKIARL YL VS DVL YNS S A 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIEEKETEDVPDDLD 
GAP IEEELDGAPLEDVDG I P I DAT PI DDLDGVP I KSLDDDLDGV 
PLDATEDS KKNE P I FKVAPS KWEAVDES ELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
S KFSKYSEMSEEKRAKLRE IELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK
DECTPTRKERKRRHSTSPS PSRSSSGRRVKSPS PKS ERSERS ER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGVVNAAKEEHETDEKRGKIYKPSSRFADQKNP
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DD YAPGS HD VGD PS TT \NF YLGN I \NPQMNLKKCC CQE FGRFG P 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKVVIPTERNLLALI 
HRM IE FWREGPMFEAMI MNR E I NNPMFR FLFENQTPAHVYYR W 
KLYS I LQGDS PTKWRTE DFRM FKNGS FWRP P PLNP YLHGMS EE Q 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEE I VDCI TES LS I LKT P LP KK I ARL YLVSD VL YNS S A 
KVANAS YYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIEEKETEDVPDDLD 
GAP I EEE LDG AP LED VDG I P I DAT P I DDLDGVP I KS LDDDLDG V 
PLDATEDSKKNEPIFKVAPSKWEAVDESELEAQAVTTSKWELFD
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
S KFS KYS EMSEEKRAKLREI ELKVMKFQDELESGKRPKKPGQS F 
QEQVEH YRDKLLQRE KE KELERERERDKKDKEKLESRS KDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSG KK S RS Q S RS PH RS HKKS KGKTNTGRKFFKKAVTYW KCDLF 
LCPERSVF 


5935 


3 


4493


SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL
S DS G S FVS S RARRE KKS ICKGRQEALE R LKKAKAGERYKYE VEDF 
TGVYE EVDEEQ YS KLVQARQDDDW I VDDDG I G YVEDGRE I FDDD 
LEDDALDADEKGKDGKARNKDKRNVKKLAVTKPNNIKSMFIACA 
GKKTADKAVDLSKDGLLGDILQDLNTETPQITPPPVMILKKKRS 
IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAG
DDVQVESTEEEQESGAMEFEEGDFDEPMEVEEVDLEPMAAKAWD 
KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGWFLF 
GKVWIESAETHVSCCVMVKNIERTLYFLPREMKrDLNTGKETGT 
PISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 
EYLEVKYSAEMPQLPQDLKGETFSHVFGTNTSSLELFLMNRKIK 
GPCWLE VKKS TALNQP VS WCKVEAMALKPDLVNVI KDVS P P PLV 
VMAFSMKTMQNAKNHQNEIIAMAALVHHSFALDKAAPKPPFQSH
FCWSKPKDCIFPYAFKEVIEKKNVKVEVAATERTLLGFFLAKV 
HKIDPDIIVGHNIYGFELEVLLQRINVCKAPHWSKIGRLKRSNM 
P KLGGRSGFGERNATCGRM I CDVEI SAKELIRCKS YHLSELVQQ 
ILKTERWIPMENIQNMYSESSQLLYLLEHTWKDA\KFILQIMC 
ELNVLPLALQ I TNI AGNI MSRTLMGGRSERNE FLLLHAFYENNY 
IVPDKQIFRKPQQKLGDEDEEIDGDTNKYKKGRKKGAYAGGLVL 
DPKVGFYDKF I LLLDFNS LYPS I IQE FNICFTTVQRVASEAQKV 
TEDGEQEQIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 
LNPDL I LQYD IRQKALKLTANSMYGCLGFS YSR FYAKPLAALVT 
YKGREILMHTKEMVQKMNLEVIYGDTDSIMINTNSTNLEEVFKL

GNYVTKQELKGLDI VRRDWCDLAKDTGNFVIGQ I LSDQSRDT I V 
ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP
HVHVALW I NSQGGRKVKAGDT VS YV I CQDGSNLTAS QRAYAP EQ 
LQKQDNLT I DTQ YYLAQQ I HP WAR I CEP I DG I DAVL I ATG WEL 
\ DPTQ FKVHHYH KDE ENDALLGG PAQ LTDEEKYRDCER FKCPCP 
TCGTEN I YDNVFDG SGTDME PS L YRCS NIDCKAS PLTFTVQLSN 
KLIMDIRRFIKKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 
ACMKATLQPEYSDKS L YTOLCFYRYI FDAECALEKLTTDHRKn^ 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGY9EVNLS KLFAGCAV 
KS 


5936 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHRSRAWTCYLA* 
RMLMATCCPSPTTTACTG P WQRAPPLRLLVQKREADS S GLAFAS 
NSLQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLGFYIRDGMS VRVAPQG \LER VPG I FI 
SRLVRGGLAESTGLLAVSDEI LE VNG I EVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


P TS LL KS T VQLMCRLLQDKR YQ C VYSLAE I FKVLAS FYV I LVI L 
YGLTSSYSLWWMLRSSLKQYSFEALREKSNYSDIPDVKNDFAFI 
IJiLADQYDPLYSKRFSIFliSEVSENKLKQINLNNEWTVEKLKSK 
LVKNAQDKIELHLFMLNGLPDNVFELTEMEVLSLELIPEVKLPS 
AVSQLVNLKELRVYHSSLWDHPALAFLEENLKILRLKFTEMGK
IPRWVFHLKNLKELYLSGCVLPEQLSTMQLEGFQDLKNLRTLYL 
KS SLSR I PQWTDLLPSLQKLSLDNEGS KLWLNNLKKMVNLKS 
LELISCDLERI PHS I FSLNNLHELDLRENNLKTVEEI I SFQHLQ 
NLS CLKLWHNNI AY I PAQ IGALSNLEQLS LDHNNI ENL PLQLFL 
CTKLHYLDLSYNHLTFIPEEIQYL\SNLQYFAVTNNNIEIMLPDG 
LFQCKKLQCLLLGKNSLMNLSPHVGELSNLTHREPIG\NYLETL
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938 


395 


1865 


YKGEGFFCNQEARGERRKKKKAMSSPNIWSTGSSVYSTPVPSQK 
^rcWILLLLSLYPGFTSQKSDDDYEDYASNKTWVLTPKVPEGDV 
TVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDIFFAQTWYDRRLKFNSTIKVLRLNSNMVGKIWIPDTFFRN 
S KKADAH W I TTPNRMLRI WNDGRVLYSLRLTI DAECQLQLHNFP 
MDEHSCPLEFSSYGYPREBIVYQWKRSSVEVGDTRSWRLYQFSF 
VGLRNTTEWKTTSGDYWMSVYFDLSRRMGYFTIQTYIPCTLI 
WLSWVSFWINKDAVPARTSLGITTVLTMTTLSTIARKSLPKVS 
YVTAMDLFVSVCFIFVFSALVEYG\TLHYFVSNRKPSKDKDKKK 
KNPAPTID IRPRS AT I QMNNATHLQERDEE YGYECLDGKDCAS F 
F CCFEDCRTGAWRHGR I H I RIAKMDS YAR I F F P TAFCLFNL VYW 
VSYLYL


5939 


66 


1404 


I R PG YLKEVQENS PGHRAGLE P F FDF I VS I NGS RLNKDNDTLKD 
LLKANVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIR 
FCSFDGANENVWHVLEVESNSPAALAGLRPHSDYIIGADTVMNE 
SEDLFSLIETHEAKPLKLYVYNTDTDNCREVIITPNSAWGGEGS
LG CG I G YG YLHR I PTR P FE SGKK I S L PGQMAGT P I TPL KDGFTE 
VQLSSVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVLSTGV 
PTVP\LLPPQVNQSLTSVPPMESSYLHLPGLMPFTRQGLPNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAAS S LTVDVTP PTAKAPTTVEDRVGDS TP VSE KP VS AA 
VD ANAS ESP 


5940 


145 


717 


RRSASRSASPRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLP VHMGLVI TEVEQEPS FSD IAS LWWCMAVGI S YI SVYDH 
QGIFKRNNSRLMDEILKQQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT
LA\VYLVQMWLILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSLLAVWLLALPVA 

WGQCNAPEW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

S IICLKNSVWTGAKDRCRRKS CRNPPDPVNGMVHVI KGIQFGSQ 

IKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICTRIPCGLPP 

TITNGDFISTNRENFHYGSWTYRCNPGSGGRKVFELVGEPSIY

CTSNDDQVGIWSGPAPQCI I PNKCTP PNVENG I LVS DNRSL FS L 

NEWE FRCQPGFVMKGPRR VKCQALNKWE P E LP S CS R VCQP P P D 

VLHAERTQRDKDNFS PGQE VFYSCEPGYDLRGAASMRCTPQGDW 

SPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 

QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASDFPIGTSLKYE 

CRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKTPPDPVNGMVH

VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKCTPPNVENGIL

VSDNRSLFSLNEVVEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CSRVCQPPPDVLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAA 

SMRCTPQGDWSPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGA

KVDFVCDEGFQLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSP 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTI

RCTSDPQGNGVWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASD 

FPIGTSLKYECRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 

PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 

TAHWSTKPPICQRIPCGLPPTIANGDFISTNRENFHYGSWTYR

CNLGSRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC

TPPNVENGILVSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQA 

LNKWEPE LPS CSR VCQP P PE I LHGEHTPSHQDNFS PGQE VF YS C 

E PG YDLRGAAS LHCTPQGDWS PEAPRCAVKSCDDFLGQLPHGRV 

LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSLWNNSVP 

VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 
TFNLIGESTIRCTSDPHGNGWSSPAPRCELSVRAGHCKTPEQF

PFASPTIPINDFEFPVGTSLNYECRPGYPGKMPSISCLENLVWS 

SVEDNC3^KSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRL 

IGS PSTTCLVSGNNVTWDKKAP ICE I ISCEPPPTISNGDFYSNN 

RTS FHNGTVVTYQCHTGPDGEQLFELVGERS I YCTS KDDQVGVW 

SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEIIRFRCQPG 

FVMVGSHTVQCQTNGRWGPKLPHCSRVCQPPPEILHGEHTLSHQ 

DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 

CDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV 

LAGMKALWNSSVPVCEQI FCPNPPAILNGRHTGTPLGDI PYGKE 

VSYTCDPHPDRGMTFNLIGESTIRRTSEPHGNGVWSSPAPRCEL 

PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDPGYLLVGK 

GFIFCTDQGIWSQLDHYCKEVNCSFPLFMNGISKELEMKKVYHY 

GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDALI 

VGTLSGTIFFILLI IFLSWI ILKHRKGNNAHENPKEVAIHLHSQ 

GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


S88 


YLYVRMRANPLAYGISHKAYQIDPPL\RKHREQ\LVIE\VGRKL 
DK\AQMIRFEERTGYFSSTDLGRTASHYYIKYNTIETFNELFDA 
HKTEGDIFAIVSKAEEFDQIKVREEEIEELDTLLSNFCELSTPG 
GVENS YGKINI LLQTYINRGEMDSFSL I SDSAY VAQNAAR I VRA 
LFE IALRKRWPTMTYRLLNLS KAIDKRLWGWAS PLRQFS I L P PH 
MLTRLEEKKLTVDKLKDMRKDE IGHI LHHVN IGLKVKQCVHQ I P 
SVMMEAF I QP ITRTVLRVTLS I YADFTWNDQVHGTVGEPWW I WV 
EDPTNDHIYHSEYFLALKKQVISKEAQLLVFTIPIFEPLPSQYY 
I RAVS DR WLGAEAVCI INFQHL ILPERHP PHTELLDLQPLP I TA 
LGCKAYEALYNFSHFNPVQTQ I FHTL YHTDCNVLLGAPTGSGKT 
VAAELAI FRVFNKY PTS KAVY I APLKAL VRERMDD W KVRI E E KL 
GKKVI ELTGDVTPDMKS I AKADLIVTTPEKWDGVSRS WQNRN YV 
QQVTILIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 
LSTALANARDLADWLNIKQMGLFNFRPSWPVPLEVHIQGFPGQ 
HYCPRMASMNKPAFQAIRSHS PAKPVLI FVSSRRQTRLTALELI 
AFLATEEDPKQWLNMDEREMENIIATVRDSNLKLTLAFGIGMHH
AGLHERDRKTVEELFVNCKVQVLIATSTLAWGVNFPAHLVIIKG 
TEYYDGKTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 
KKDFYKKFLYEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD
Y I TWTYFFRRLIMNPS YYNLGDVSHDS VNKFLSHL I EKSLI ELE 
LSYCIEIGEDNRSIEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 
STEELLSILSDAEEYTDLPVRHNEDHMNSELAKCLPIESNPHSF
DSPHTKAHLLLQAHLSRAMLPCPDYOTDTKTVLDQALRVCQAML 
DVAANQGWLVTVLN ITNLI QMV I QGRWL KDS SLLTLPN I ENHHL 
HLFKKWKPIMKBPHARGRTSIECLPELIHACX3GKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNELSVST 
LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 
PR FP KS KDEGWFL I LGE VDKRELI ALKR VG Y I RNHHVAS LS F YT 
PE I PGRY I YTL YFMSDC YLG LDQQ YD/NLSQR YTS E S FCTGQHQ 
GL 


5943 


1 


2274 


DK P TRHKT YLS SS WAKMAAAEG P VGDGEL W<2 T WLPNHVVFLRLR 
EGLKNQSPTEAEKPASSSLPSSPPPQLLTRNWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LS PTQHHVAL IGIKGLMVLELPKRWGKNSE FEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHWLLTSDNVIRIYSLR 
E PQTP TNVI I LS EAEEE S LVLNKGRAYTAS LGETAYAFD FGPLA 
AVPKTLFGQNGKDEWAYPLYILYENGETFLTY I SLLHS PGN/ 1 
W KAVG S IAHAS \ AAEDNYG YDACAVL CL P CVPNI LVI AT E S GML 
YHCWLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVELELALKL 
ASGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
HKLHKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLPCRQPAP 
IRGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR 
EDVEVAESPLRVLAETPDSFEKHIRSILQRSVANPAFLKASEKD 
IAPPPEECLQLLSRATQVFREQYILKQDLAKEEIQRRVKLLCDQ 
KKKQLEDLSYCREERKSLREMAERLADKYEEAKEKQEDIMNRMK 
KLLHS FHSEL P VLS DS E RDM KKE LQL I P DQLRHLGNA I KQ VTMK 
KDYQQQKMEKVLSLPKPTIILSAYQRKCIQSILKEEGEHIREMV 
KQINDIRNHVNF 


5944 


167 


3428 


FS I AT FTDE P EVLTE P PS ATTTTT I G I S AT WTTLAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
KKQPSVLVTFPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPSPLSSPNGKLTVASPKRGQKREEGWKJ^A^RSKKVSVPSTVI 
SRVIGRGGCNINAIREFTGAHIDIDKQKDKTGDRIITIRGGTES 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTS LMGI KMTTVALS STS QTATALT VP AI S S AS THKT I KN P 
VN\NVRPGFPVSFP\LAYPPPQFAHALLAAQTFQQIRPPRLPMT 
HFGGTFPPAQSTWGP FPVRPLS PARATNS PKPHMVPRHSNQNS S 
GSQVNSAGSLTSS PTTTTSSS ASTVPGTSTNGS PS S PS VRRQLF 
VTVVKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSSP 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 

t?\)T)MTVDDT HTCCS "D^ 77V T 7 13 OTA "DTTTVOM Dr\TT3M/*T'Ilrtr>*riTlT/MI?T' 

EiVKrll V ^trLiAiboAlrVAv tfo 1AFV1 x PM rQl PMCjC PQ PTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FRPPLQRPAPSPSGIVNMDSPYGSVTPSSTHLGNFASNISGGQM 
YG PGAP LGGAPAAANFNRQH FS P LSLLTPCS S ASNDS S AQS VS S 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
I HRPMS D PG VFSQHQAMERDS TG I VTP S GT FHQHV PAG YMDFP K 
VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSLIK
MVbba IrjJNWOFUi VW l\jPWAPnMNbVnMNQJ_iG 


5945 


1461 


197 


GVTHLFLFGKRKLRNGIAEDLKGQADFFFLLVSEAVVATGSPRA
WLTCXILPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ
QPQPEKPESTLDGAAARAFYEAL I GDESS APDSQRSQTEPARER 
KRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAA 
QEGDLPELRRLLEPHEAGGAGGNTNARDAFWWTPLMCAARAGQG
AAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESH
GETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGP
QPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTV
REERRREE\KDRAWERDLRTYMNLEF


5946 


541 


1666 


ILGSYSSIQPEEYS\SWC\EWLQDLLA\YVSPK\HSYLRDLP 
SEGSPQRVNS IDFV\EL\EHLQPDVLVHAVLRWDF/TI LTEAV 
YSYRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW\YPQLQRKKG
YI WEFKYLFVQCNYTLENLELHTTPWSSCECLFDDD I RAITFKA 
KFQKSAPS FVKI SDLATHLEDKCSG WL IKAQ I SELAFP I TASQ 
KI ALNAHS S LKS I FS SLPNI VYTGCAKCGLELETDENRI YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCIi^VIVPSSEITYGMVVADLFHSLIAVSAEPCVLKIQSLFVL 


5947 


3 


1317 


RGIPDRRRRGPIGRVNMDLENKVKKMGLGHEQGFGAPCLKCKEK 
CEGFELHFWRKI CRNC\NVAKKSM/TVLLSNEEDRKVGKLFEDT 
KYTTLIAKLKSDGIP^KRNVMILTNPVAAKKNVSINTVTYEWA 
P P VQNQALARQ YMQMLP KE KQP VAGS EGAQ YRKKQLAKQL P AHD 
QDPSKCHELSPREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
DPAI YAERAG YD KLWHPAC FVCST CHE LLVDM I YF WKNEKL YCG 
RHYCDSEKPRCAGCDELI FSNEYTQAENQNWHLKHFCCFDCDSI 
LAGEIYVMVNDKPVCKPCYVKNHAVVCQGCHNAIDPEVQRVTYN
NFS WHAS TECFLCS CCS KCL I GQKFM P VEGMVFCS VE C KKRMS 


5948 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCICMDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQSRAR PADCVLC PNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E \ VG FANTVFI EP IDGVRNT P PARWKLT \ CNL CKE KGR / VGAC I 
QCHKANCYTAFHVTCAQKAGL YM KME PVKELTGGGTT FS VRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVLPTVCAPYI PPQRLNRIANQVAIQRKKQ 
F VE RAHS YWLL KRLS RNGAPLLRRLQ SSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV
F YRAAVRLRDQGGWLRQARREVDS IGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSELI SCIENGNYAKAARIAAEV 
GQSSMWISTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKSKMVPLGIDETIDKLKMMEGRNSS IRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID


5949 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ
GNHYQMRRKGRCHRGS AARH P S S P CS VKHS PTRETLTYAQAQRM 
VE IE I EGRLHR ISI FDPLE I I LEDDLTAQEMSECNSNKENSERP 
P VCLRTKRHKNNRVKKKNEAL PS AHGTPAS AS AL PE PKVR I VE Y 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCI CMDGECQNSNVILFCDMCNLAVHQECYGVP YI PEGQWLC / 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E \ VG FANT VF IE P I DG VRN I P PARWKLT \ CNLCKE KGR/ VGAC I 
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRS VLDQLQDKDPAR I FAQP VSLKE VPDYLDH I 
KHPMD FATMRKRLEAQGYKNLHE FEEDFDLI I DNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDS IGLEEASGMHLPERPAAAP 
RRP FS WE D VDRLLD PANRAHLGLEEQLRE LLDMLDLTCAMKS SG 
S RS KRAKLLKKE I ALLRNKL S QQHS QP LP TG PGLEG FE EDGAAL 
GPEAGEE VLPRLETLLQPRKRS RS TCGDS E VEEES PGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRG KPALVRRHTLEDRS EL IS C I ENGNYAKAAR I AAE V 
GQSSMWISTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKSKMVPLGIDETIDKLKMMEGRNSS IRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5950 


1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSR
CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPS FWQLPPQ 
DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKE\YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTR VLLTASTL KS I PTS LLGDL F FRP I IGDVDI AGLLGDMLLLR 


5951 


143 


5449 


WNVKPSLLWQLFKFSDKEEHEQNDSISGKTGETGVEEMIATRK

VEQDSKETVKLSHEDDHI LEDAGSSD I SSDAACTNPNKTENS LV 

GLPSCVDEVTECNLELKDTMGIADKTENTLERNKIEPLGYCEDA 

ESNRQLESTEFNKSNLEWDTSTFGPESNILENAICDVPDQNSK 

QLNAIESTKIESHETANLQDDRNSQSSSVSYLESKSVKSKHTKP 

VIHS KQNMTTDAP KKIVAAKYE VIHS KTKVNVKS VKRNTD VPES 

QQNFHRPVKVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKKTLQ 
DQTLVQIFKPLTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK
QCHKPQQQAPAMKTNSHVKEELEHPGVEHFKEEDKLKLKKPEKN
LQ P RQRRS SKSFSLDEPPLFI PDNI AT IRREGS DHS S S FE S K YM 
WTPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDCVGLSLSQAQQM 
GEEDKEYVCVKCCAEEDKKTEILDPDTLENQATVEFHSGDKTME
CE KLGLS KHTTNDRTK Y I DDT VKHKVK I L KR E SGEGRNS SDCRD 
NEIICKWQLAPLRKMGQPVLPRRSSEEKSEKIPKESTTVTCTGEK 
ASKPGTHEKQEMKKKKV\EKGVLNVHPAASASKPSADQIRQSVR 
HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKELFSFFRDTDAK
YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEELAS
KELAAWRRRENRHTIEMIEKEQREVERRPITKITHKGEIEIESD
APMKEQEAAMEIQEPAANKSLEKPEGSEK\RKEEVDSMSKDTTS
QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKVWGVARKHSDNE
AESIADALSSTSNILASEFFEEEKQESPKSTFSPAPRPEMPGTV
EVESTFLARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL
PDSIQVGGRISPQTVWDYVEKIKASGTKEICWRFTPVTEEDQI
SYTLLFAYFSSRKRYGVAANNMKQVKDMYLIPLGATDKIPHPLV
PFDGPGLELHRPNLLLGLIIRQKLKRQHSACASTSHIAETPESA
PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ
NLQEDLPTAVEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLEL
ANKPLPVDDILQSLLGTTGQVYDQ\AQSVMEQNTVKBIPFLNEQ
TNSKIEKTDNVEVTDGENKEIKVKVDNISESTDKSAEIETSWG
SSSISAGSLTSLSLRGKPPDVSTEAFLTNLSIQSKQEETVESKE
j\ x u JvKvijUJiijy fi.WMijyjjJMy j. bb PCRSNVGKGNTDGNVSCSEN 
LVANTARS PQF INLKRDPRQAAGRSQPVTTS ES KDGDS.CRNGEK 
HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 
PSVENIQTSQAEQAKPLQEDILMQNIETVHPFRRGSAVATSHFE 
VGNTCPSEFPSKS ITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 
SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 
VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 
RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYE KDKERE KS KHREGEKDRDRYH 
KDRDHTDRTKSKR 


5952 


3226 


639 


PPARRSARDLPRAI»SMEAARPSGSWNGALCRLL\LVTL\AFIjIF ' 
ASDACKNVTLHVPSKLDAEKIiVGRVNLKECFTAANLIHSSDPDF 
QILEDGSVYTTNTI LLSSE KRSFTI LLSNTENQE KKKI FVFLEH 
QTKVLKKRHTKE KVLRRAKRRWAP I PCS MLENSLGPFPLFLQQV 
QSDTAQNYTIYYSIRGPGVDQEPRNLFYVERDTGNLYCTRPVDR 
EQYES FE 1 1 AFATTPDGYTPELPLPL 1 1 KIEDENDNYP I FTEET 
YTFTI FENCRVGTTVGQVCATDKDE PDTMHTRLKYS I IGQVP PS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLK1KVQDMDGQYFGL 
QTTSTCI INIDDVNDHLPTFTRTS YVTS VEENT VD VE ILRVTVE 
DKDLVNTANWRANYT I LKGNENftWP'K' TVTnn VTt<ivr\rr r>\n.rvnr 

NYEEKQQMILQIGWNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
GRTCTGTLGI ILODVNDNS PF I PKKTVI ICKPTM3 AFTva vno 

DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
GSYWPirVRDRLGMSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAILLGIALFFCILFTLVCGASGTSKQPKVIPDD 
LAQQNL I VSNTE APGDDKVYS ANGFTTQTVGAS AQGVCGTVGSG 
IKNGGQETIEMVKGGHQTSESCRGAGHHHTLDSCRGGHTEVDNC 
R YTYS E WHS FTQPRLGEES I RGHTL I KN 


5953 


330 


811 


PLLCNPDPGWYWWVKQESEISKESQEMDARPKLDLGFKEGQTIK 
LCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTI PPPSS/ V 
KLPSTNHVTPPSIPKSNHGGSDADILLDLDSPAPVTTPAPTPVS 
VSNDLWGDFSTASSSVPNQAPQPSNWVQF 


5954 


32 


2130 


PPPPPPKIiANMADLEAVLADVSYLMAMEKSKATPAARASKRIVL 
PEPSIRSVMQKYLAERNEITFDKIFNQKIGFLLFKDFCLNEINE 
AVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 








PFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFM
ES DKFTR FCQWKNVE LN I HLTMNE FS VHRI IGRGGFGE VYGCRK 
ADTGKMYAMKCLNKKR I KMKQGETLAIjNER IMLS LVSTGDCPFI 
VCM TYAFHT PDKLC F I LDLMNGGDLHYHLS QHG VFS E KEMRFYA 
TEIILGLEHMHNRFVVYRDLKPANILLDEHGHARIS\DLGLACD 
FSKKKPHASVGTHGYMAPEVLQKGTAYDSSADWFSLGCMLFKLL 
RGHSP FRQHKTKDKHE I DRMTLTVNVELPDTFS PELKSLLEGLL 
QRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPP 
RGEVNAADAFD IGS FDE EDTKG I KLLDCDQELYKNFPLVI S ERW 
QQEVTETVYEAVNADTDKIEARKRAKNKQLGHEEDYALGKDCIM 
HG YML KLGN P FLTQWQRR Y FYLF PNRLEWRGEGE SRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAP KFLNKP RS GTVE LP KP S LCHRNSNGL 


5955 


1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR 
PANRQDVLSGWINLPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ 
VWKRC INI WRD VGLFG VLNE I AN S EE E VFE WVKTASGWALAL CR 
WASSLHGSLFPHLSLRSEDLIAEFAQVTNWSSCCLRVFAWHPHT 
NKFAVALLDDS VRVYNAS S TI VPSLKHRLQRNVAS LAWKPLS AS 
VIAVACQSCILIWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSLA 
WAPSGGRLLSASPVDAAIRVWDVSTETCVPLPWFRGGGVTNLLW 
S P DGS KI LATT P S AVFR VWE AQMWT CER WP TLSGRCQTGCW S PD 
GSRLLFTVLGEPLIYSLSFPERCGEGKG\ALEVQSQQRLWQICIi 
RQQ YRHQMVRRGLGERLT P WS GT P VGNVWL CL 


5955 


1705 


139 


GVGVRGARAMATVQEKAAALNLSAIiHS PAHRPPGFSVAQKPFGA 
TYVWS S 1 1 NTLQ TQ VEVKKRRHRLKRHNDC FVGS E AVDV I FSHL 
IQNKYFGDVDI PRAKWRVCQALMDYKVFEAVPTKVFGKDKKPT 
FEDSSCSLYRFTTIPNQDSQLGKENKLYSPARYADALFKSSDIR 
SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRL 
LQLVDLPLLDS LLKQQEAVP K I PQ P KRQSTMVNS SNYLDRG ILK 
AYSDS QEDEWLS AAIDCS E YLPDQMWE ISRS FPEQPDRTDLVK 
ELLFDAIGRYYSSREPLLNHLSDVHNGIAELLVNGKTEIALEAT 
QLLLKLLDFQNREE FRRLL YFMAVAANP S E FKLQKE S DNRM WK 
RIFSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKIPGTL\HKI 
VS \ VK \ LMAI QNGRDPNRDAG Y I YCQR IDQRDYSNNTEKTTKDE 
LLNLLKTLDEDS KLS AKE KKK\ LLGQ F YKCHPDI F I EH FGD 


5957 


1479 


451 


ELQVAVAMDTLDR WKPKT KRAKRFLE KR E PKLNEN I KNAMLI K 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 
S KKSDCSLFMFGS HNKKRPNNLVTGRMYD YHVLDM I ELG IENFV 
SLKDI KNSKCPEGTKPML I FAGDDFDVTEDYRRLKSLLIDFFRG 
P T VSNI RLAGLE YVLHFTALNG KI YFRS YKLLLKKSGCRTPR I E 
LEEMGPSLDLVLRRTHLASDDLYKLSMICMPKALKPKKKKNISHD 
T FGTT YGR I HMQ KQDLS KLQTR KM\ KGLKKRP AE R I TSDHE KKS 
KR I KKKLME LSQ P LLFHCVLLKR 1 1 KHQS I QS FL 


5958 


1 


3138 


AAALGMLLWFPACQAKNLDVEKIiTVYSGPKGS YFG YAVDFHI PD 
ARTASVLVGAPKANTSQPDIVEGGAVYYCPWPAEGSAQCRQIPF 
DTTNNRKI RVNGTKE P I EFKSNQWFG\ ATVKA\HKGKS CGP VAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGQGYCQAGFSLDFYKNGDLIVGGPGSFYWQGQVITASVADIIA 
NYSFKDILRKLAGEKQTEVAPASYDDSYLGYSVAAGEFTGDSQQ 
ELVAG I PRGAQNFG YVS I INS YDMTFIQNFTGEQMAS YFGYTW 
VS DVNSDGLDDVLVGAPLFMERE FESNPRE VGQI YLYLQVSS LL 
FRDPQILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVPFAGKD 
QRGKVLIYNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SDIDKNDYPDLIVGAFGTGKVAVYRARPVVTVDAQLLLHPMIIN 
LENKTCQVPDSMTSAACFSLRVCASVTGQSIANTIVLMAEVQLD 
SLKQKGAIKRTLFLDNHQAHRVFPLVIKRQKSHQCQDFIVYLRD 
ETEFRDKLS P I NI SLN YS LDES TFKEGLE VKP I LNY YRENI VS E 
QAHILVDCGEDNLCVPDLKLSARPDKHQVIIGDENHLMLI INAR 
NEGEGAYEAELFVMIPEEADYVGIERNNKGFRPLSCEYKMENVT 
RMVVCDLGNPMVSGTNYSLGLRFAVPRLEKTNMSINFDLQIRSS 
NKDNPDSNFVSLQINITAVAQVEIRGVSHPPQIVLPIHNWEPEE 








EPHKEEEVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEFL 
LYIFHIQTLGPLQCQPNPNINPQDIKPAASPEDTPELSAFLRNS 
T I PHL VRKRD VHWE FHRQS P AKI LNCTN I ECLQ I S CAVGRLEG 
GES AVLKVRS RLWAHTFLQRKNDP YALASLVS FEVKKMP YTDQP 
AKLPEGS I AI KTSVI WATPNVS FS I PLWVI ILAILLGLLVLAIL 
TLALWKCGFFDRARPPQEDMTDREQLTNDKTPEA 


5959 


1 


1166 


GTSGYAAQQLPSLLKEREFHLGTLNKVFASQWLNHRQWCGTKC 
NTLFWDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNPSRT 
LLATGGDNPNS LAI YRLPTLDPVCVGDDGHKDWI FS I AW ISDTM 
AVSGSRDGSMGLWEVTDDVLTKS DARHNVS RVP VYAH I THKALK 
DI PKEDTNPDNCKVRALAFNNKNKELGAVSLDG YFHLWKAENTL 
SFCLLSTKLPYCRENVCLAYGSEWSVYAVGSQAHVSFLDPRQPSY 
NVKS VCSRERG S G I RS VS F YEH 1 1 T VGTGQGS LLF YD I RAQRFL 
EERLSACYGSKPRLAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 


870 


FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATNPLNKB 
LD WAS INGFCEQLNEDFEGPPLATRLLAHKI QS PQEWEAI QALT 
VLETCMKSCX3KRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVKSDPKLPDDT 
TF PLP PPRPKNVI FEDEEKSKMLARLLKSSHPEDLRAANKLI KE 
MVQEDQKRMEKISKRVNAIEEVNNNVKLLTEMVMSHSQGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EALAEILiOA 
NDNLTQVINLYKQLVRGEEVNGDATAGS I PGSTS ALLDLSGLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSLLDDELMSLGLSDPTP 
PSGPSLDGTGWNSFQS S DATEPPAPALAQAPSMESRP PAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LESIKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLVWVSM 
LSTAPQPIRNI VFQSAVP KVMKVKLQ P PSGTELPAFNP I VHPSA 
ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKI EDFKVGNLLGKGSFAGVYRAES IHT 

glevaikmidkkamykagmvqrvqnevkihcqlkhpsilelyny 
fedsnyvylvlemchngemnrylknrvkpfsenearhfmhqiit 
gmlylhshgilhrdltlsnllltrnmnikiadfglatqlkmphe 
khytlcgtpnyispeiatrsahglesdvwslgcmfytlligrpp 
fdtd wkntlnkvvladyemprfls ieakdlihqllrrnpadrl 
slssvldhpfmsrnsstkskdlgtvedsidsghatistaitass 
stsisgslfdkrrlligqplpnkmtvfpknksstdfsssgdgns 
fytqwgnqetsnsgrgrviqdaeerphsrylrrayssdrsgtsn 
sqsqaktytmerchsaemlsvskrsgggeneerysptdnnanif 
nffkektssssgsferpdnnqalsnhlcpgktpfpfadptpqte 
tvqqwfgnlqinahlrktteydsispnrdfqghpdlqkdtskna 
wtdtkvkknsdasdnahs vkqqntmkymtalhs kpe i iqqecvf 
gsdplseqsktrgmeppwgyqnrtlrsitsplvahrlkpirqkt 
kkaws i ldse evcvelvke yasqe yvke vlq i s sdgnt i t i yy 
pngg\rgfpla\drppspt\dnisr\ysf\dnlpekywrkyqya 
s rfvqlvrs ks p ki tyftr yakc i lmens pgadfevwf ydgvk i 
hkted f i q vi e ktgks ytlks es evns lkee i km ymdhaneghr 
i clales i iseeerktrsap ffp 1 1 igrkpgsts s pkals p p p s 
vdsnyptrdrasfnrmvmhsaasptqap ilnpsm vtneglgltt 
tasgtd i s sns l kdclpksaqllks vfvknvgwatq \ ltsgavw 
vqfndgsqlwqagvss is yts pngq\ttr\ygeneklpdyi kq 
klqclssillmfsnptpnfh 


5962 


20 


2447 


RVCSS S AS TAS QAVMADAWE E I RRIiAADFQRAQ FAEATQRLS ER 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GRVNIVDLQQVINVDLIHIENRIGDIIKSEKHVQLVLGQLIDEN 
YLDRLAEEVNDKLQES GQVT I S ELCKTYDLPGNFIiTQALTQRLG 
R I ISGHIDLDNRGVI FTEAFVARHKARI RGLFSAITRPTAVNSL 
I S K YG FQEQLL YS VLEEL VNSGRLRGT WGGRQDKAVFVPDI YS 
RTQSTWVDS FFRQNG YLEFDALSRLGI PDAVSYI KKRYKTTQLL 








FLKAACVGQGLVDQVBASVEEAI SSGTWVD I APLLPTSLSVEDA 
AILLQQVMRAFSKQASTWFSDTVWSEKF\ INDCTEL FRE LMH 

TEGS GS MRGGGGGNARE YK I KKVKKKGRKDDDS DDESQS SHTG K 
KKPE ISFMFQDEI EDFLRKHIQDAPEEFI SELAEYLIKPLNKTY 
LEWRS VFMS STTSASGTGRKRT I KDLQEEVSNL YNNI RLFEKG 
MKFFADDTQAALTKHLLKSVCTD I TNLI FNFLASDLMMAVDDPA 
AITS E IRKKI LS KLSEETKVALTKLHNSLNE KS IEDFI S CLDSA 
AEACDIMVKRGDKKRERQILFQHRQALAEQLKYTEDPALILHLT 
SVLLFQFSTHSMLHAPGRCVPQ I IAFLNSKI PEDQHALLVKYQG 
LWKQLVSQSKKTGQGDYPLNNELDKEQEDVASTTRKELQELSS 

O X IVLSJJ v Jj J\o K J\.o o v 1 a Ei 


5963 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAPGMP\GLMGSN 
GS PGQPGTPGS KGS KGE PG I QGM PGAS GLKGE PGATGS PGE PG Y 
MGLPG I QGKKGD KGNQGE KG I QGQKGENGRQG I PGQQG I QGHHG 
AKGERG EKGE PGVRGA I G S KGE SGVDGLMG PAG P KGQ PGD PG PQ 
GP PGLDGKPGRE FSEQF I RQVCTDVI RAQLP VLLQSGR I RNCDH 
CLSQHGS PGI PG P PGP I G P EGPRG L PGLPGRDGVPGLVG VP GRP 
GVRGLKGLPGRNGE KGSQG FGYPGEQGPPGPPGPEGPPGIS KEG 
PPGDPGLPGKDGDHGKPGIQGQPGPPGICDPSLCFSVIARRDPF 
RKGPNY 


5964 


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDP HE LRNI FLQ YAST E VDGERYMT PEDFVQR YLGLYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKS GNGE VTFENVKE I FGQT 1 1 HHH I P FNWDCE F I RLHFGHNR 
KKHLN YTE FTQFLQ ELQLEHARQAFAL KDKS KS GM I S GLDFSD I 
MVTIRSHMLTPFVEENLVSAAGGS I SHQVSFS YFNAFNSLLNNM 
ELVRKI YSTLAGTRKDAEVTKEEFAQSAI RYGQATPLEIDI LYQ 

Jb/UJ Li I JMAb^KJj 1 liADI fc,RI APLAEGALPYNLAELQRQQSPGLGR 

p i wlq iaesayrftlgsvagavgatavyp idlvktrmqnqrgsg 
swgelmyknsfdcfkkvlryegffglyrglipqligvapekai 

KIiTVND FVRDKFTRRDGS VPLP AEVLAGG CAGG S QVI FTNPLEI 
VKIRLQVAGEITTGPRVSALNVLRDLGIFGLYKGAKACFLRDIP 
FSAIYFPVYAHCKLLLADENGHVGGLNLLAAGAMAG\VPAASLV 
TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAARVFRSS PQFG \ VTLVT YELLQRG F YI D FGGLKPAGSEPTP K 
SRIADLPPANPDHIGGYRLATATFAGIENKFGLYLPKFKSPSVA 
WQPKAAVAATQ 


5965


1 


1498 


MVTWLYRFLPTSNMAAKLRS LLP PDLRLQFWLHARLQKCFLSRG 
CGSYCAGAKASPLPGKMAMGLMCGRRELLRLLQSGRRVHSVAGP 
oWWiAjft.i'iji AKJjijr JrAAFCCLKirHXlj r LtAAb GPRS LSTSAISFA 
EVQVQAPPWAATPS PTAVP EVAS GETAD WQTAAE QS FAELGL 
GS YTP VGL I QNLLE FMHVDLGL P W WGAI AACT VFARCL I FPL IV 
TGQREAARIHNHLPEIQKFSSRIREAKLAGDHIEYYKASSEMAL 
YQKKHG I KLY KPL I LP VTQAP I F I S FF I ALR EMANL PVP S LQTG 
GLWWFQDLTVSDP I YI LPLAVTATMWAVLELGAETGVQS SDLQW 
MRNVIRMMPLITLPITMHFPTAVFMYWLSSNLFSLVQVSCLRIP 
AVRTVLKIPQRVVHDLDKLPPREGFLESFKKGWKNAEMTRQLRE 
REQRMRNQLELAARGPLRQTFTHNPLLQPGKDNPPNIPSS\SSS 
SSKPKSKYPWHDTLG 


5966 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK 
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE

GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKWTNKQEMGTYLRFIVSRMKE 
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF








KDRMKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5967 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRPWKSKNEDRGEEEAESSISSTSNEQLKVT
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC 
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5968 


81 


1288 


VRFPRRGGAPPTVLTPGRQQGVFLGPQRPGSEPDIPARGQPHPP 
RPVGVS TS AQAQ VQPPAMHRRRLALGLGFCLLAGTS IiS VLW V YL 
ENWLPVS YVPYYLPCPE I FNMKLHYKREKPLQPWWSQYPQPKL 
LEHRPTQLLTLTPWLAPI VS EGTFNPELLQHI YQPLNLT IGVTV 
FAVGN / HFLESAEE FFMRG YRVHY YI FTDNPAAVPGVPLGPHRL 
LSSIPIQGHSHWEETSMRRMETISQHIAKRAHREVDYLFCLDVD 
MVFRNPWGPETLGDLVAAIHPSYYAVPRQQFPYERRRVSTAFVA 
D S EGD F Y YGGAVFGGQ VARVYE FTRG CHMA ILAD KANG I MAAWR 
E ESHLNRHFISNKP S KVLS P E YLWDDRKPQPP S L KL I RFS TLDK 
DISCLRS 


5969 


1126 


503 


D VG FN I KRKRCD LD VFLE S P RKP SGRRDRAPEKQRR I AAN KCLC 
TGVREGEPPS / TTSQKVKEAGRDFTYL I WLFG I S I TGGL FYTI 
FKELFS S S S PSKI YGRALE KCRSHPE VI GVFGES VKG YGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKQGTVYAQVKENP 
GSGEYDFR YI FVE IES YPRRTI I IEDNRSQDD 


5970 


316 


4712 


SQDNIGHRLLQKHGWKLGQGLGKSLQGRTDPIPIVVKYDVMGMG 

RMEMELD YAE DATERRRVLE VE KEDTEELRQ KYKDYVDKEKA I A 

KALEDLRANFYCELCDKQYQKHQEFDNHINSYDHAHKQRLKDLK 

QREFARNVSSRSRKDEKKQEKALRRLHEIiAEQRKQAECAPGSGP 

MFKPTTVAVDEEGGEDDKDESATNSGTGATASCGLGSEFSTDKG 

GPFTAVQ ITNTTGLAQAPGLASQGI S FG I KNNLGTPLQKLGVSF 

SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 

GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE 

PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 

KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 

KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLATPA 

GKESQEGPKHPTGPFFP VLSKDESTALQ WPS ELLI FTKAE P S I S 

YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGljDPGE 

PNKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 

TEDTG RS LPS KKERSGKSHRHKKKKKHKKS S KHKRKHKADTEEK 

SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 

PPRRRRRAQDDSQRRSIiPAEEGSSGKKDEGGGGSSSQDHGGRKH 

KGE L P P S S CQRRAGTKRSS RSSHRS QPS SGDEDS DDAS S HRLHQ 

KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 

DASSDQSCYSRQRSYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 

SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 

SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 

KRS WGHES PEERHS GRRD F I RS K I YRSQS PH YFRS GRGEG PGKK 

DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL 

EKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 

LGNKPVLPLIGKLPATRKPNKKCEESGLERGEEQEQSETEEGPP 

GSSDALFGHQFP \SEETTGPLLDPP PEES KSGEVTADHPVAPLG 

PPAHFDCYLGDPTISHNYLPDPSDGNTLESLDSSSQPGPVESSL 

LPIAPDLEHFPSYAPPSGDPSIESTDGAEDA\SLAPLESQPITF 








TPEEMEKYSKXQQAAQQHIQQQLLAKQVKAFPASAALAPATPAL
QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGIHPHPHPQ 
PLAQVHHIPQPHLTPISLSHLTHSIIPGHPATPLASHPIHIIPA 
SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPI 
FSGQDLQHP PSKGT 


5971 


53 


2149 


SFLYFVGVDMDNPIGNWDGRFDGVQLCSFACVESTILLHINDII
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RS EL F YTLNGS S VDS Q PQS KS KNTWY I DE VAED P AKS LTE I S TD 
FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGS IGHSPL 
SLSAQSVMEEIiNTAPVQESPPIiAMPPGNSHGLEVGSLAEVKENP 
PFYGVIRWIGQPPGLNEVLAGLELEDECAG\CTDGTF/REGTRY 
FTCALKKALF VKLKS CRPDS R FAS LQP VSNQ I ER CNS LAI WEAY 
LSE WEENTP TQKWE KEGLE I M I G \ KKKG I QGHYNS C YLDS TL F 
CLFAFSSVLDTVLLRPKEKNDVEYYSETQELLRTEIVNPLRIYG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLN1LFHHILRV 
EPLLKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 
SNLKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICGGLAMYECRE CYDD PD I S AGKI KQFCKTCNTQVHLHP 
KRLNHKYNPVSLPKDLPDWDWRHGCI PCQNMELFAVLCI ETSHY 
VA FVKYG KDDS AWLF FDSMADRDGGQNG FN I PQVT PCP E VGE YL 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTQSPTMSLYK 


5972 


440 


1761 


ILLAGSPSPRDQCSQRQSSGGDKELVTRGCTFSTAWSPSAMTQ 
EPFREELAYDRMPTLERGRODPASYAPnzxtfPSnT.nT citdt ddpu 

SHKTWVFSVLMG S CLLVTS GFSL YLGNVF PAEMD YLRCAAG S C I 
PSAIVSFTVSRRNANVI PNFQI LFVS TFAVTTTCLI W FGCKLVL 
NPS A IN I NFNL I LLLLLE LLMAAT V 1 1 AARS S EEDCKKKKGS MS 
DSANILDEVPFPARVLKSYSWEVIAGISAVLGGIIALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 
TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLLLLLV 
LLLQA/ G PQHGHRHP VRALQGQC KAAG C I LGH P ERPAGAPG WGG 

GQEPPEGVRQGESLESRRGANGPVTPRRGNRVAAPSLAPGMETH 
NP 


5973 


65 


2007


NGDGKDLFGHIWAWRSNGIISNFRRSPHAGMAEDEPDAKSPKTG
GRAP PGGAEAGEPTTLLQRLRGT IS KAVQNKVEG I LQDVQKFSD 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKIIREIFPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
E VTPAPRDELVEAACALTCDWAERI LKRS FSS I VEVapft .t .nnu 
L I S ARS AHAHVLKAMGLAEEDEHAPRE RS S KP KNGL ENPEGGAH 
KKPERLAQPPKDLEARTGAGPLARGERKKSWESSAPGANNLQV 
NALVARLPLLLPRAP RS L I P P I P VS P P I LAPRLS S GALKVATLP 
LS S RAG AP PAAVP I INM I L P TVPALPG PG PGPGRAP PGGLTQ PR 
GTENRE VG I GGDQG PHDKGVKRTAE VPVS EASGQAP PAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL 
PWETWGSGGEGNSAGGAERPGPMGEABKGAVLAQG\QGDGTVSK 
GGRGPGS QHTKEAEDKIPLVPS KVSVIKGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQSSLSQEHKDPKATPP 


5974


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDVVMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQWGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD 
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM








ESQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDVVMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG 
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK 
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQWGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 
EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5976 


20 


2949 


VHHLHLTRVSVWNLDI I LRIAQQMGIKTLNLVLd \LKRA\LEF 
P E VS WME VKD PNMKGAMLTNTGK YAI PT I DA\ EAYAI G KKE KP P 
FLPEE PSS SS EEDDP I PDELLCL I CKDIMTDAWI PCCGNS YCD 
E C I RTALLE S DEHTC P TCHQNDVS PDAL I ANK FLRQAVNNFKNE 
TGYTKRLRKQLPSPPPPIPPPRPLIQRNLQPLMRSPISRQQDPL 
MIPVTSSSTHPAPSISSLTSKQSSLAPPVSGWPSSAPAPVPDIT 
ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA 
LMEEKGYQVPVLGTPSLI^fQSLLHGQLIPTTGPVRINTARPGGG 
RPGWEHSNKLG YLVS P PQQ IRRGERS CYRS INRGRHHS ERS QRT 
QGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSVPPPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRSFSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 
RSHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYHRSRSRSPPYR 
RYHSRSRSPQAFRGQSPNKRNVPQGETEREYFNRYREVPPPYDM 
KAYYGRS VDFRDP FE KERYREWERKYREWYEKYYKGYAAGAQPR 
P S ANRENFS P ER FLP LNI RNS P FTRGRREDYVGGQS HRSRN I GS 
NYPEKLSARDGHNQKDNTKSKEKESENAPGDGKGNKHKKHRKRR 
KGEESEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
P VRDE PMDAE S I TFKS VS E KDKRERD K P KAKGDKT KRKNDGS AV 
SKKENIVKPAKGPQEKVDG\DVRDLLDLNL\QLKKPKEETPKDL 
TILNHHLPLRRMKKSL\EPP\EKLTIiNQQK\TPRNKTSQRGKSE 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHS INHILHPGAGVAAGPATGW/REYLT 
PVLKES KFKETGV I TPEE FVAAGDHLVHHCPTWQWATGEE LKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAIIEEDDGDGGWV 
DTYHNTG ITG ITEAVKE I TLENKDNIRLQDCSALCEEEEDEDEG 
EAADMEEYEESGLLETDEATLDTRKIVEACKAKTDAGGEDAILQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKT VT I ENHP HL P PP PMCS VH PCRHAE VMKK 1 1 E T VAEGGG EL 
GVHMYLL I FLKFVQAVI PT I E YDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELVNCRWAEEVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMH 
GGHTFKPLAE I YE QHVTKVNE E VAKLRRRLME LI S LVQE VE RNV 
EAVRNAKDERVRE I RNAVEMM I ARLDTQLKNKLI TLMGQKTS LT 
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM 
AS FVTTPVPPDFTS ELVPS YDS ATFVLENFSTLRQRADP VYS PP 
LQVSGLCWRLKVYPDGNGWRG YYLS VFLELS AGLPETS KYEYR 
VEMVHQS CND PTKN 1 1 REFAS DFE VGE CWG YNRFFRLDLLANEG 
YLNPQNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
INNLKERLTIELSRTQKSRDLSPPDNHLSPQNDDALETRAKKSA 








CSDMLLER\GPYSAS\VREAKEDEEDEEKIQNEDYHHELSDGDL 
DLDLVYEDEVNQLDGSSSSASSTATSNTEENDIDEETMSGENDV 
EYNNMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 
ATSSLiLDIDPLILIHLLDLKDRSSIENLWGLQPRPPASLLQPTA 
SYSRKDKDQRKQQArWRVPSDLKMLKRLKTQMAEVRCMKTDVKN 
TL S E I KS S S AAS GDMQTS L FS ADQ AALAACGTENS GRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 
DRQ CKALD S DAWVAVFS GLPAVE KRRKMVTLGANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 
QEEHTSVGGFHDS FMVMTQPPDEDTHSSFPDGEQIGPEDLS FNT 
DENSGR 


5979 


212 


3665 


LPDMTMYLWIiKLLAFGFAFLDTEVFVTGQSPTPSPrDAYLNASE 
TTTLS P SGS AVI S TTT I ATTPSKPTCDEKYANI TVDYLYNKETK 
L FTAKLNVNENVE CGNNTCTNNE VHNLTEC KNAS VSISHNS CTA 
PDKTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 
DTQNITYRFQCGNMI FDNKE I KLBNLEPEHE YKCDSEILYNSHK 
FTNASKI IKTDFGSPGEPQI I FCRSEAAHQGVITWNPPQRS FHN 
FTLCYI KETEKDCLNLDKNLI KYDLQNLKP YTKYVLSLHAY 1 1 A 
KVQRNGS AAM CHFTT KS AP P SQVWNMTVSMTSDN S MHVKCRP P R 
DRNG PHER YHLE VE AGNT LVRNE SHKNCDFRVKDLQ YS TD YT FK 
AYFHNGDYPGEP F I LHHSTS YNS KAL I AFLAFL 1 1 VTS I ALLW 
L Y K I YDLHKKRS CNLDEQQE L VE RDDEKQLMNVE P I HAD I LLET 
YKRKI ADEGRLFLAEFQS I PRVFSKFPI KEARKPFNQNKNRYVD 
I L P YD YNRVE LS E I NGD AGS N YINAS Y I DG FKE PR KYI AAQG PR 
DETVDDFWRMIWEQKATVIVMVTRCEEGNRNKCAEYWPSMEEGT 
RAFGECCCKDLTKHKRCP\DYIIQKLNIVNKKEKATGREVTHIQ 
FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIWHCSAGVGR 
TGTYIGIDAMLEGLEAENKVDVYGYWKLRRQRCLMVQVEAQYI 
LIHQALVEYNQFGETEVNLSELHPYLHNMKKRDPPSEPSPLEAE 
FQRLPSYRSWRTQHIGNQE\ENKSKNRNSNVIPYDYNRVPLKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 
AAQGPLKETIGDFWQMIFQRKVKVIVMLTELKHGDQEICAQYWG 
EGKQTYGDI EVDLKDTDKS S T YTLRVF ELRHS KRKDS RTVYQ YQ 
YTNWSVEQLPAEPKELISMIQVVKQKLPQKNSSEGNKHHKSTPL 
LIHCRDGSQQTGI FCALLNLLES AETEEWDI FQWKALRKARP 
GMVSTFEQYQFLYDVIAS TYPAQNGQVKKNNHQEDKI EFDNEVD 
KVKQDANCVNPLGAPEKLPEAKEQAEGSEPTSGTEGPEHSVNGP 
ASPALNQGS 


5980 


3 


2363 


DAWG CKLRRLRFT YGTQTR VS LALPGQ YEL VHTL VAHQGNWET I 
PEEDLBVQENNEDAAHDLTELEVTMHHALLQEVDWVAPCQGLR 
PT VD VLGDLVND FL P VIT YALHKDELS E RDEQELQE I R KYFS FP 
VFFFKVPKIiGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 
CGAPGQDTKAQSMLVEQSEKLRHLSTFSHQVLQTRLVDAAKALN 
LVHCHCLD I F INQAFDMQRDLQ I TP KRLE YTRKKENELYES LMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTRE I KCCIRQIQELI ISRLNQAVANKLI SSVDYLRESFVGTL 
ERCLQSLEKSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR 
MLWEQI KQ 1 1 QR ITWVS PPAI TLE WKRKVAQEAIESLSAS KLAK 
S I CSQFRTRLNSSHEAFAASLRQLEAGHSGRIjEKTEDLWIiRVRK 
DHAPRLARLSLESRSLQDVLLHRKPKLGQELGRGQYGWYLCDN 
WGGHFPCALKSWPPDEKHWNDLALEFHYMRSLPKHERLVDLHG 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
DWEG IRFLHSQGLVHRDIKLKNVLLDKQNRAKITDLGFCKPEA 
MMSGSIVGTPIHMAPELFTGKTDNSVDVYAFGILFWYICSGSVK 
LPE AFERCAS KDHLWNNVRRGAR PERLP VFDEECWQLMEAC WDG 
DPLKRPLLGIVQPMLQGIMNRLCKS\NSEQPNRGLDDST 


5981 


1 


2519 


GRRHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 
DAPP PPAAPLPRWSGP IGVS WGLRAAAA\GGAFPRGGRWRRSAP 
G\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 








TEFGMAIGPENSGKVVLTAEVSGGSRGGRIFRSSDFAKNFVQTD 
LPFHPLTQMMYSPQNSDYLLALSTENGIjWSKNFGGKWEEIHKA 
VCLAKWGSDNTIFFTTYANGSCKADLGALBLWRTSDLGKSFKTI 
GVKIYSFGLGGRFLFASVMADKDTTRRIHV5TDQGDTWSMAQLP 
SVGQEQFYS ILAANDDMVFMHVDEPGDTGFGTI FTSDDRGI VYS 
KSLDRHLYTTTGGETDFTNVTSLRGVYITSVLSEDNSIQTMITF 
DQGGRWTH LRKPBNS ECDATAKNKNE CS LH I HAS YS I SQ KLNVP 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EGPHYYTILDSGGIIVAIEHSSRPINVIKFSTDEGQCWQTYTFT 
RDPIYFTGLASEPGARSMNISIWGFTESFLTSQWVSYTIDFKDI 
LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 
QNGRDYVVTKQPSICLCSLEDFLCDPGYYRPENDSKCVEQPELK 
GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLS P E KQNS KS NS VP 1 1 LA I VGLMLVT WAGVL I VKKYVC 

GGRFLVHLYSVLQQH\AEA\NGVDGVDALDTASHTNKSGYHDDS 
DEDLLE 


5982 




2316 


ATRPPRGSSWCRQFSRTASAAPGRSNMLRIPVRKALVGLSKSPK
GCVRTTATAASNLIEVFV1XK3SVMVEPGTTVLQACEKVGMQIPR 
FCYHERLSVAGNCRMCLVEIEKAPKWAACAMPVMKGWNILTNS 
EKS KKAREG VME FLLANHPLDCP I CDCX^GECDLQDQSMMFGNDR 
SRFLEGKRAVEDKNIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 
LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 
E E W I SDKT RFAYDGLKRQRLTE PMVRNE KG LLT YTS WEDALS RV 
AGMLQSFQGKDVAAIAGGLVDAEALVALKDLLNRVDSDTLCTEE 
VFPTAGAGTDLRSNYLLNTriAGVEEADWLLVGTNPRFEAPIiF 
NAR IRKS WLHNDLKVAL IG SPVDLT YT YDH LGDS PKI LQD I ASG 
SH P FS Q VLKEAKKPMWLG SS ALQRNDGAA I LAAVS S IAQK I RM 
TSGVTGDWKVMNILHRIASQVAALDLGYKPGVEAIRKNPPKVLF 
LLGADGGC I TRQDLPKDCF 1 1 YQGHHGDVGAP I ADVI LPGAAYT 
E KS AT YVNTEGRAQQTKVAVTP PG LARED WK I IRALSEIAGMTL 
P YDTL \ DQ VRNR LEE VS PNLVRYDD I EG \ ANYFQQANELS KL VN 
QQI.LADPLVPPQLTMKDFYMTDS ISRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5983 


248 


1763 


EARGDGGRRRHRASGRRAGRGEP \AGLKSQGQRAVPKRAVARGG 
RQ\YSAAIALLEPAGSEIADDLSILYSNfRAACYLKEGNCSGCIQ 
DCNRALELHPFSMKPLLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GLQLANDSVNRLSRILMELDGPNWREKLSLIPAVPASVPLQAWH 
PAKEMISKQAGDSSSHRQQGITDEKTFKALKEEGNQCVNDKNYK 
DAL S K YS ECLKI NNKECAI YTNRALC YL KLCQFE EAKQDCDQAL 
QLADGNVKAFYRRALAHKGLKNYQKSLIDLNKVILLDPSI IEAK 
MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
S TRKD KEACAHLLA I TAP KDLPMFLSN KLEGDTFLLLI QS LKNN 
LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFEDL 
SDTPNNHFTLED IQALKRQYEL 


5984 


755 


1193 


SSVCMACTWSNIjGKKQRSVSFLASGLMRVSTGPELRLHHSFVL 
TGDVGRRI CRLLVGLFTKGDTSS KRVHPFS PGPCFLLCDLARVG 
SS PKINVS PFYQN\QTSTQRSCTVFVWQRCSLVGPFQVTVFTMY 
FHHSLRS I S RFS SG 


5985 


22 


1408 


RRVARPGTAEPAKARRTVRRGRARRDLAGAERKAGVSERGDSGR 
RRPNPS I PSAAAGMSHIQ I PPGLTELLQGYTVEVLRQQPPDLVE 
FAVEYFTRLREARAPASVLPAATPRQSLGHPPPEPGPDRVADAK 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HPKTDEQRCRLQEACKDILLFKWLDQEQLSQVLDAMFERrVKAD 
EHVIDQGDDGDNFYVI ERGT YDI LVTKDNQTRS VGQYDNRGS FG 
ELALMYNTPRAATI VATSEGSLWGLDRVTFRRI I VKNNAKKR KM 
FESFIESVPLUCSLEVSERMKIVDVIGEKIYKR/DGERIITQGE 
K\ADSFYIIESGEVSILIRSRTKSNKDGGNQEVEIARCHKGQYF 
GEI4ALVTNKPRAASAYAVGDVKCLVMDVQAFERLLGPCMDIMKR 
N I S HYE E Q L VKM FGS S VD LGNLGQ 


5986 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP
S PCCRFDS PRGP P P PRLGLLGALMAEDGVR GS PPVP DDMT?T?n 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP 
S RKGLVLQL IQ S YQRMP GNAMVRG FRVAYKRHVLTMDDLGTL YG 
QNWLNDQVMNM YGDLVMDTVPEK\ VHFFNS FFY \DKLRTKGYDG 
VKRWTKNVD I FNICELLL I PIHLEVHWSLISVDVRRRTI TYFDSQ 
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN 

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKS TSLTFHWKLWGRHRGRRRGLAHP KNHLS PQQGGATPQVP 

GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
L I SNVCS IGDHVAQELFQGSDLGMAEEAERPGEK\AGQHS PLRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEVVEKLEDIFQQEFSTP 
SRKGLVLQLI QS YQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
QNWIjNDQVMNMYGDL VMDTVPEK\ VHFFNS FFY \DKLRTKGYDG 
VKRWTKNVD I FNKELLL I P IHLE VHWS LIS VD VRRRT I TYFDSQ 
RTLNRRC P KH I AKYLOAE AVKKDP T >npunrw vc* v pvmtnh t> rwmt 

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5988 


1292 


410 


FKKYFLS FLGLLE S SH S RDR I HNL VLM FLLATHNL VWW FTCRFQ 
RLDCIYLNAGIMPNPQLNIKALLFGLFS\AEGLLTQGDKITADG 
LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLEDFQHSKGKEPYSSSKYATDLLSVALNRNFNQQGLYSNVAC 
PGTALTNLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE 
ALVWLFHQKPESLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 
KFYQKLLELEKHIRVTIQKTDNQARLSGSCL 


5989 


194 


2610 


AMDFPQHSQHVLEQLNQQRQLGLLCDCTFVVDGVHFKAHKAVLA 
ACSE YFKMLFVDQKDWHLD I SNAAGLGQVLEFM YTAKLS LS PE 
NVDDVL\AVATFLQMQDI ITACHALKSLAEPATSPGGNAEALAT 
EGGDKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGQAQ 
S AASGAEQTE KADAPR E P P P VEL KPD PTSGMAAAEAEAAL S E S S 
EQEMEVEPARKGEEEQKEQEEQEEEGAGPAEVKEEGSQLENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCGKEFTHTGNFKRH I R I HTGEKPFS CRECS KAFS DPAACK 

AHEKTHSPLKPYGCEECGKSYRLISLLNLRKKRHSGEARYRCED 
CGKLFTTS GNLKRHO LVHSGK KPYnrnvrfiPQ PQn dt c vmtj lit t? 

THDTDKEHKCPHCDKKFNQVGNLKAHLKXHIADGPLKCRECGKQ 
FTTSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 
SQLANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHI I IHTGEKPY 
LCDKCGRGFNRVDNLRSHVKTVHQGKAG I KILEPEEGSEVS WT 
VDDMVTLATEALAATAVTQLTVVP VGAAVTAD E TE VLKAEI S KA 
VKQVQEEDPNTHILYACDSCXSDKFXiDANSLAQHVRIHTAQALVM 
FQTDAD F YQQ YGPGGTWPAGQVLQAG EL VFRP RDGAEGQ PALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSLSRLGPSLRDKDLEMEELMLQDETLLGTMQSYMDASLIS 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEEVASFSGQILAGELDNCVSSIPDFP 
MHliACPEEEDKATAAEMAVPAAGDESISSLSELVRAMHPYCLPN 
LTHLASLEDELQEQPDDLTLPEGCWLEIVGQAATAGDDLEIPV 
WRQVSPGPRPVLLDDSLETSSALQLLMPTLESETEAAVPKVTL 
CS E KEGLS LNS EE KLD S ACLLKPRE WE P WP KE PQNP PANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENSSPKN 
LERSAGQSSPAKEGPLDLYPKLADTIQTNPIPTHLSLVDSAQAS 
PMPVDSVEADPTAVGPVLAGPVPVDPGLVDLASTSSELVEPLPA 
EPVLINPVLADSAAVDPAWPISDNLPPVDAVPSGPAPVDLALV 
DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES 
LDPPKTIIPEVKEWDSLKIBSGTSATTHEARPRPLSLSEYRRR 
RQQRQAETEERSPQPPTGKWPSLPETPTGLADIPCLVIPPAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPP ENVL P LS MAP PLS LGLPGHGAPQTE P T KVE VKP VP AS PHP K 
HKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRB 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 
GNSGGVDI PQEKRPLDRLQAPELANVAGLTPPATPPHQLWKPLA 
AVSLLAKAKS P KSTAQEGTLKP EGVTEAKH PAAVRLQEGVHGPS 
RVHVGS GDHD YC \ VRS RT P P KK\ MP ALL I P EVGSRWNVKRHQD I 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAP CLAPS 
S LLS P EAS PCRNDMNTRTP P E PS AKQRSMRC YRKACRS AS PS SQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVLQKERAIEERRWFIGK 
I PGRMTRSELKQRFSVFGE I EECTIHFRVQGDNYGFVTYRYAEE 
AFAAIESGHKLRQADEQPFDLCFGGRRQFCKRSYSDLDSNREDF 
D P AP VKS KFDS LDFDTLLKQAQKNLRR 


5991 


334 


1379 


RLSSHFSQCSPSIYC \ TKFDKQGNVTS FE RKKTE L YQE LGLQAR 
DLRFQHVMS I T VRNNR I I MRME YLKAVI T P ECLL I LD YRNLNLK 
QWLFRELPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
SILQPLILBTLDALGDPKHSSVDRSKLHILLQNGKSLSELETDI 
KIFKESILBILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM 
ELLLENYYRLADDLSNAARELRVLIDDSQS 1 1 F INLDSHRNVMM 
RLNLQLTMGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLITGI 
MFMGSGLIWRRLLSFLGR/ LARS S IAS YGMKDMVHGGIVEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/ SEVLDASSLSFNTRLKWFAICFVC 
GVFFS ILGTGLLWLPGG I KLFAVF YTLGNLAALASTCFLMGP VK 
QLKKMFEATRLLATIVMLLCFIFTLCAALWWHKKGLAVLFCILQ 
FLS MTWYSLSYI P YARDAVI KCCS SLLS 


5993 


1650 


594 


AEGLGSWAVWAGLGWAGRHMEAGGATGALGVGCKLPSAFCFPGS 
S VAMDMFQKVE K I GEGT YG WYKAKNRE TGQLVALKK I RLDLEM 
EGVPSTAIREISLLKELKHPNIVRLLDWHNERKLYLVFEFLSQ 
DLKKYMDSTPGSELPLHLIKSYLFQLLQGVSFCHSHRVIHRDLK 
PQNLLINELGAI KLADFGLARAFG VPLRTYTHE WTLWYRAPEI 
LLATRF YTTAVDI WS IGC I FAEMVTRKALFPGDS \EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG 
RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH i 


5994 


394 


1934 


AGE VQLH VW I RG MRI QPQ/ KAAA 1 1 DL DPD FE PQS R PRS CTW PL 
PR P E I ANQPS KP PE VEPDLGEKVHTEGRSE P ILLPSRLPE P AGG 
PQ PG I LGAVTG PRKGGS RRNAWGNQS YAEL IS QAI ESAPEKRLT 
LAQI YEWMVRTVPYFKDKGDSNSSAGWKNS IRHNLSLHSKFI KV 
HNEATGKSSWWMLNPEGGKSGKAPRRRAASMDS S S KLLRGRS KA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FR PRS S S NAS S VSTRLS PLR PE SEVLAE E I PAS VS S YAGGVP P T 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLFSPAEGPLSAGEGCFSSSQALEALLTSDTPPPPADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SM I AP PP VMAS AP I P KALGT P VLTP PTE AASQDRM PQDLDLDMY 
MENLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\RRQELLEARF\TGLGVSKGPLNSESSNQSL 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 
ISDYFERRVEQPLYGLDGSAAKEATEEQSALPTLMSVMIAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQISI 
QHRQT \ QS D LT I E KI SALENS KNS DLE KKEGR I D DLLRANCDLR 
RQl\DEQQKMLEKYK\ERLNRCFDNEPRNFLIEKSKQEKMACRD 
KSMQDRLRLGHFTTVRHGAS FTEQWTDGYAFQNL I KQQ3RINS Q 
REE IERQRKMLAKRKP PAMGQAPPATNEQKQRKSKTNGAENETL 
TLAEYHEQEE I FKLRLGHLKKEEAE I QAELERLERVRNLHIREL 
KR I HNEDNSQF KDHPTLNDR YLLLHLLGRGGFS EVYKAFDLTEQ 
R Y VAVK IHQLN KNWRD E KKENYHKHACRE YR I HKELDHPRI VKL 
YDYFSLDTDS FCTVLEYCEGNDLDFYLKQHKLMSEKEARS I IMQ 
I VNALKYLNE I KPPI IHYDLKPGNI LLVNGTACGE I KI TDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFVVGKEPPKISNK 
VDVWSVGVIFYQCLYGRKPFGHNQSQQDILQENTILKATEVQFP 
PKPWTPEAKAFIRRCLAYRKEDRIDVQQLACDPYLLPHIRKSV 
S TS S P AGAAI AS TS GAS NNS S SN 


5996 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFSIWFGS I VNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALD VYFPQ I S SVKDRKK\ AVLSGHP WSGE PHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP 


5997 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS I VNEGYLNSASEGEE FC I YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQ I S SVKDRKK\ AVLSGHPWSGE PHPAA 
FWAFLWFTGDS C YL\ ANQWQVSKP KDNPLNEGTDAS PGRPS PFS 
FFS I FTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP 


5998 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS I VNEGYLNSASEGEE FC I YNRNPNACS YGVAVGVL 
AFLTCLL YLALD VYF PQ I S S VKDRKK\ AVLS GH P WS GE PH PAA 
FWAFLWFTGDSC YL\ANQWQVS KPKDNPLNEGTDAS PGR PS PFS 
FFS IFTWSLTAALAVRRFKDLSFQEE YSTLFP \ ASAQ P 


5999 


2 


1790 


RP PM E KARRGGDG VPRG P VLH I VWGFHHKKGCQVE FSYPPLIP 
GDGHDSHTLPEEWKYLP FLALPDGAHN YQEDTVFFHLP PRNGNG 
AT VFG I S C YR \Q I EAKALKVRQAD I TRETVQ KS VCVLS KL PL YG 
LLQAKLQL I THA YFEE KDFS Q I S I LKEL YEHMNS S LGGAS LEGS 
QVYLGLS PRDLVLHFRHKGL I LFKL I LLEKKVLFYI S PVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTNLGTI RKVMAGNHGEDAAMKTEE PLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNLFPKDS VPS ESLP I TVQPQANTGQ WLI PGLI SGLE 
EDQYGM PLAI FTKG YLCLPYMALQQHHLLSDVTVRGFVAGATNI 
LFRQQKHLSDAIVEVEEALIQIHDPELRKLLNPTTADLRFADYL 
VRHVTENRDDVFLDGTGWEGGDEWIRAQFAVYIHALLAATLQLV 
LFRI VNVAKKI GNVMVTT\ SRNWQTGK\AVGQSVGGAFS \ SAK 
TA\ MS S WLSTFTTS T SQS LTE P PDEKP 


6000 


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
DDFEDKVFYRTEFQNREFKATMCNLLAYLKHLKGQNEAALECLR 
KAEEL IQQEHADQAE IRSLVTWGNYAWVYYHMGRLSDVQI YVDK 
VKHVCEKFSSPYRIESPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
LRSAA\ KFYRGKDEPDKAI ELLKKALE Y I P\NNAYLHCQIGCCY 
RAKV FQ VMNLRE NGM YGKRKLL E L IGHAVAHLKKADEANDNL FR 
VCSILASLHALADQYEDAEYYFQKEFSKELTPVAKQLLHLRYGN 
FQL YQMKCE DKAI HH F I EG VK INQKS RE KE KMKDKLQKI AKMRL 
SKNGADSEALHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 
SWNGE 


6001 


176 


1038 


AFAHSPSRGHRHTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 
WLHFDADGSGYLEGKELQNLIQELQQARKKAGLELSPEMKTFVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
WRKYDTDHSGF I ETE ELKNFLKDLLEKANKTVDDTKLAE YTDLM 
L KL FDS NNDG KLE LTEMARLL P VQEN FL LKFQG I KMCG KE FNKA 
FELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 
MALSDGGKLYRTDLALILCAGDN 


6002 


911 


81 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLHPHS 

S MAQRSDLLELDCQLTRDR VWVS HDENL CRQ SGLNRD VGS LD F 

EDLPLYKEKLEVYFSPGHFAHGSDRRMVRLEDLFQRFPRTPMSV 

EIKGKNEELIREQ/VLVRRYDRNEITIWASEKSSVMKKCKAANP 

EMPLSFTISRGFWVLLSYYLGLLPFIPIPEKFFFCFLPNIINRT 

YFPFSCSCLNQLLAWSKWLIMRKSLIRHLEERGVQWFWCLNE 

ESDFEAAFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003 


140 


4098 


GKLRAFRGMRRLI CKR I CDYKS FDDEES VDGNRPSSAASAFKVP 

AP KTSGNPANSARKPG SAGG P KVGAGAS KEGGAGAVDEDDF I KA 

FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 

SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 

HLSTVLGNKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFII 

RHTH VPRL I P L I TSNCTS KS VP VRRRS FE FLDLLLQE WQTHSLE 

RHAAVLVET I KKG I HDADAE AR VE ARKTYMGLRNHFPGEAETL Y 

NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 

KWSTANPSTVAGRVS AGSS KAS SLPGSLQRSRSDIDVNAAAGAK 

AHHAAGQS VRSGRLG AG ALNAG S YAS LED TS D KLDGTAS EDGRV 

RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 

TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV 

ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 

TGALYAPEVYGASGPG YG I SQSSRLS SSVSAMRVLNTGSDVEEA 

VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 

YSSRNGSIPTYMRQT\EDV\AEVLNRCASSNWSERECEGLLGLQN 

LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 

QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 

FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 

F INSS ETRLAVS R VI TWTTE PKSS D VRKAAQS VL I SLFE LNT PE 

FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 

RSPANWSSPLTSPTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 

VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 

GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 

PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHNER 

VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 

IRAIJujKVLREILRHQPARFKNYAELTVMKTLEAHKDPHKEVVR 

S AEE AAS V \ LATS I \ S P EQC I KVLCP 1 1 QTADYP I NIiAAI KMQT 

KVIERVS KETLNLLLPE IMPGL I QGYDNSESSVRKACVFCLVAV 

HAVIGDELKPHLSQLTGS KMKLLNLY I KRAQTGSGGADPTTDVS 

GQS 


6004 


140 


4098 


G KLRAFRGMRRL I CKR I CD Y KS FDDE ES VDGNR PS S AAS AF KVP 
APKTSGNPANSARKPGSAGGPKVGAGAS KEGGAGAVDEDDF I KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQVVREACITVA 
HLSTVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVLVET I KKG I HDADAEAR VEARKT YMGLRNH FPG EAETL Y 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDI DVNAAAGAK 
AHHAAGQSVRSGRLGAGALNAGS YAS LEDTSDKLDGTAS EDGRV 
RAKLS APLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVS SGVQRVLVNS ASAQKRS KI PRS QGCSRBAS PS RLS V 
ARSSRIPRPSVSQGCSREASRESSRDTS PVRS FQPLASRHHSRS 
TGALYAPE VYGASG PG YG I SQ S S RLSS S VS AMRVLNTGSDVE EA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YS SRNGS I PTYMRQT \ EDV\ AE VLNR CASSNWS ERKEGLLGLQN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 
FPNDLQFNI LMR FTVDQTQTPS L KVK VAI LKY I ETLAKQMDPGD 
F I NS S ETRLAVS RV I TWTTEP KS SD VRKAAQS VL I S LFE LNTPE 
FTMLLGAL P KTFQDG AT KLLHNHLRNTGNGTQ S S MGS P LTR PTP 
RS PANWS S PLTS P TNTSQNT LS PS AFD YDTENMNS ED I YSSLRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHNER 
VEERECIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
IRALALKVLRE I LRH Q PAR FKNYAELTVMKT LEAH KDPHKEWR 
SAEEAASV\LATSI\SPEQCIKVLCPIIQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDALL 

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 

KRQKKERMLLCRQLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 

GKKKKKKLG P KKEKKS KS KRKEEE E EDDDDDDDS KB P KSS AQLL 

EDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 

KMMMVLGAKWREFSTNNPFKGS SGAS VAAAAAAAVAWESMVTA 

TEVAPPPPPVEVPIRKAKTKEGKGPNARRKPKGSPRVPDAKKPK 

P KKVAPLKI KLGGFGS KRKRS S SEDDDLDVESDFDDAS INS YS V 

S DGSTS RS SRSRKKLRTTKKKKKGEEEVTAVDG YETDHQDYCEV 

CQQGGEIILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 

QWEAKEDNSEGEEILEEVGGDLEEEDDHHMEFCRVCKDGGELLC 

CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 

WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 

YWHCS W VSE LQLELHC \ QVMFRNYQRKNDMD EP P S GD FGGDEEK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRILNHSVDKKG 

HVH YL I KWRDLP YDQAS WE S EDVE I QD YDLF KQS Y WNHRE LMRG 

EEGRPGKKLKKVKLRKLERPPETPTVDPTVKYERQPEYLDATGG 

TLHPYQMEGLNWLRFSWAQGTDTILADEMGLGKTVQTAVFLYSL 

YKEGHSKGPFLVSAPLSTIIN\WEREFEMWAPDMYV\VTYVGDK 

DSRAI I RENE FS \ FEDNAI RGGKKASRMKKEAS VKFHVLLTS YE 

LITIDMAILGSIDWACLIVDEAHRLKNNQSKFFRVLNGYSLQHK 

LLLTGTPLQNNLEELFHLLNFLTPER FHNLEGFLEEFAD IAKED 

Q I KKLHDMLG \ PHMLRRLKAD VFKNM P S KTELI V\ RVELS PM \ Q 

KKY YK\ Y I LKS KFLKALN \ ARGGGNQ VS LLNWMDLKKCCNH P Y 

LFPVAAMEAPKMPNGMYDGSALIRASGKLLLLQKMLKNLKEGGH 

RVL I FS QMTKMLDLLED FLEHEG YKYER I DGG I TGNMRQEA I DR 

FNAPGAQQFCFLLSTRAGGLGINLATADTVIIYDSDWNPHNDIQ 

AFSRAHRIGQNKKVMIYRFVTRASVEERITQVAKKKMMLTHLVV 

RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 

S SV I H YDDKA I ERLLDRNQDETEDTELQGMNE YLS S FKVAQ YW 

REEEMGEEEEVERE 1 1 KQEES VDPDYWEKLLRHHYEQQQEDLAR 

NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 

DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 

QRKAFLNAI MR YGM P PQDAFTTQWLVRDLRGKS EKE FKAY VS L F 

MRHLCEPGADGAETFADGVPREGLSRQHVLTRIGVMSLIRKKVQ 

EFEHVNGRWSMPELAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 

NTPAPVP PAEDG I KI EENS LKEEES I EGEKE VKSTAPETAI ECT 

QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEEKSAIDLTPIWEDKEEKKEEEEKKEVMLQNGETPK 

DLNDEKQKKNIKQRFMFNIADGGFTBLHSLWQNEERAATVTKKT 

YE IWHRRHDYWLLAGI INHGYARWQD IQNDPRYAI LNEPFKGEM 

NRGNFLEIKNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH 

PSMALNTRFAEVECLAESHQHLS KES MAGNKP ANAVLHKVL KQ L 

EE LLS DM KAD VTRLP ATXAR I PP VAVRLQMS ERN I LS RLANRAP 

EPTPQQVAQQQ 


6006 


1 


965 


DND FLRNTVHRHE P P VTAE P I RLLAENED WWD KPS S I P VHPC 
GRFRHNTVIFILGKEHQLKELHPLHRLDRLTSGVLMFAKTAAVS 
ER I HEQ VRDRQLE KE YVCRVEGE FPTE EVTCKEP I LWS YKVG V 
CR VDPRG KPCETVFQRLS YNGQS S WRCRPLTGRTHQ IR VHLQF 
LGHPILNDP I YNSVAWGPSRGRGGYIPKTNEELLRDLVAEHQAK 
QSLDVLDLCEGDLSPGLTDSTAPSSELGKDDLEELAAAA\QKME 
EVAEAAPQELDTIALASEKAVETDVMNQ\RQT\TLCRVPAGATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


HELGQ VE YVFTD KTGTLTE NEMQ FRE CS INGMKYQE INGRLVPE 
GPTPDSSEGNLSYLSSLSHLNNLSHLTTSSSFRTSPENETELIK 
EHDLFFKAVSLCHTVQINNVQTDCTGDGPWQSNLAPSQLEYYAS 
SPDEKALVEAAARIGIVFIGNSEETMEVKTLGKLERYKLLHILE 
FDS DRRRMS V I VQAPS GE KLLFAKG AESS ILPKCIGGEIE KTR I 
HVDEFAIiKGLRTLCIAYRKFTSKEYEEIDKRIFEARTALQQR\E 
E KLAAVFQ F I E KDLI LLGATAVE DRLQDKVR ET I EALRMAG I KV 
WVLTGDKHETAVSVSLSCGHFHRTMNILELINQKSDSECAEQLR 
QLARR I TEDHVI QHGL WDGTS LS LALREHE KLFMEVCRNC S AV 
LCCRMAPLQKAKVIRLI KI S PE KP I TLAVGDGANDVSM I QEAHV 
G I G IMG KEGRQAARNSD YAI AR FKFLS KLL FVHGHFY Y I R I ATL 
VQYFFYKNVCFITPQFLYQFYCLFSQQTLYDSVYLTLY\NICFT 
SLPIL I YSLLEQHVDPHVLQNKPTLYRD IS KNRLLS IKTFLYWT 
ILGFSHAFIFFFGSYLLIGKDTSLLGNGQMFGNWTFGTLVFTVM 
VITVTVKMALETHFWTWINHLVTWGSIIFYFVFSLFYGGILWPF 
LGSQWMYFVF IQLLSSGSAWFAI ILMWTCLFLD 1 1 KKVFDRHL 
HPTSTEKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRC S PTH I SRS WS AS DP FYTNDRS I LTL STMDS ST C 


6008 


4554 


1089 


AGVRRAGARRGPGRALPAGATAVPPPSARRRRRCPAPEHAGPAR 
ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 
DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 
FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTFNADKKTLETH 
I KI FHAPNASAPSSSLSTFKDKNKNDGLKPKQADS VEQAVY YCK 
KCT YRD PL YE I VRKH I YRE HFQHVAAP Y I AKAGE KS LNGAVP LG 
S NAREE S S I HCKRCL FM PKS YEALVQHVT EDHER I G YQ VTAM I G 
HTNVWPRSKPLMLIAPKPQDKKSMGLP PRIGS LASGNV\RS LP 
SQQMVNRLSIPKPNLNSTGVNMMSSVHLQQNNYGVKSVGQGYSV 
GQSMRLGLGGNAP VS I PQQ S QS VKQLL P S GNGR S YG LGSEQRS Q 
APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 
ATG P P PGNTS STQKWK I CT I CNE LFPENVYS VHFE KEHKAE KVP 
AVANY I MK I HNFTS KCL YCNR YLPTDTLLNHML I HGLS CP YCRS 
T FNDVEKMAAHMRM VH I DEEMG P KTDSTLS FDLTLQQGS HTN I H 
LLVTTYNLRDAPAES VAYHAQNNPPVPP KPQPKVQEKADI PVKS 
S PQAAVPYKKDVGKTLCPLCFS ILKGPISDALAHHLRERHQVIQ 
TVHPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVGKTQN 
GQDKTNAPS RLNQS PSLAP VKRT YEQME FPLLKKRKLDDDSDS P 
SFFEEKPEEPWLALDPKGH\EDDSYEARKSFLTKYFT\KQPYP 
TRRE IEKLAAS LWV \ W K\ SD I AS H FSNKRKKCVRDCE KYKPGVL 
LGFNMKELNKVKHEMDFDAEGLFENHDEKDSRVNASKTADKKLN 
LGKEDDSSSDSFENLEEESNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 
EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDE 
SSQSE DARS S KPAAKKKATMQGDREQLKWKNS S YGKVEG FWSKD 
QSQWKNASENDERLSNPQIEWQNSTIDSEDGEQFDNMTDGVAEP 
MHGSLAGVKLSSQQA 


6009 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC 
HLVLGVLVP VARQS SHSAG P AQS AFR * TGTGSGTPKAAEQS GY W 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQAS QRRT VFTAGGGECLGAKS VRAS VFTGNQ PGVMGLL 
NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCR I H I VDAVC * S EHH * DHFLAAAFLENST 1 1 S * V7APGS WQDHA 
VLQ KE VQ AS VRCRGFE S VDTAPAGFWAHS PPGLQGEPTTTS VSL 
FVLAPQDGEGVP FVEGQLVTVLGL WPQS I RHTFVHHTQL FLHP 
I * KLGALDVAFLHLLTLVCS SFNVAYG *GKNGGTTLHQLFAEVN 
AVTRGSAVQRRPS ITISS IHVDTKIQQELHDVMVAGADGWQWG 
DPFWGLAGIFHLIDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHVWI VLCRLGSLVGGLGTDE LL W FGGR * L 1 1 IG 
I * * RGRLS GE WG CGLGRGE LFQ VS I G IG VS I VH I GQGDHJS VLGG 
AGL VERGALHATG QG VE AL VQQLLDVGPAGALGLCDG AALFQG P 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQYFLKGG*RLWCARGQ*PVKKRQRRWRG*TR 
R * NG LTIHCFN * L I *GAVCCRLVI LRWCGLLEVHGVYGT * I HCL 
GSFPGRLWP*PFISQERPNGHCQWEFRLAVPSWKCRWSRWRVRG 
TWRYGNPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LP PFQGACRPRTQRCRTWVCP IAWRQLLAYTRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AG I S QNAKTGDLPAFG E CVG I AS KALCGLTEAAAQAAYLVG I FD 
PNS QAGHQGL VDP I QFARANQA I QMACQNLVDPG S S P SQ VLS AA 
TIVAKHTSALCNACRIASSXTANPVAKRHFVQSAKEVANSTANL 
VKTI KALDGDFS EDNRNKCRI ATAPLI EAVENLTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAINPKDP 
PTWS VLAGHSHTVSDS I KS L I TS I RDKAPGQRE CDYS IDG INRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFBPL I LAAVGVAS KILDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNP KAQHTHDAI TE AAQLMK 
EAVDD I MVTLNEAASEVGLVGGMVDAIAEAMSKLDEGTPPE P KG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGGIiASQMTSD 
YGHLAFQGQMAAATAEPEE IGFQI RTRVQDLGHGC I FLVQKAG\ 
ALQVCPTDSYTKRELIECARAVTEKVSLVLSALQAGNKGTQACI 
TAATAVSGI I ADLDTT I M FATAGTLNAENS E TFADHREN I L KTA 
KALVEDTKLLVSGAAST PDKLAQAAQS SAATITQLAE WKLGAA 
SLGSDDPETQWLINAI KDVAKALSDL I SATKGAAS KPVDDPSM 
YQLKGAAKVMVTNVTSLLKTVKAVEDEATRGTRALEAT IEC I KQ 
ELTVFQS KDVPEKTSSPEES I RMTKG ITMATAKAVAAGNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TLGYLDLLEHVLVILQKPTPELKQQLAAFSKRVAGAWELIQAA 
EAMKGTE WVDPBDPTVIAETELLGAAAS I EAAAKKLEQLKPRAK 
P KQADETLDFEEQ I LEAAKS I AAATS AL VKSAS AAQREL VAQGK 
VGS I PANAADDGQWS QGL I S AARMVAAATSSLCEAANAS VQ GHA 
SEEKLISSAKQVAASTAQLLYACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGGI AQI IAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6011 


446 


1835 


LLQPAMRKSPGLSDCLWAWILLLSTLTGRSYGQPSLQDELKDNT 
TVFTR ILDRLLDG YDNRLRPGLGERVTEVKTDI FVTS FGPVSDH 
DME YT I D VF FRQS W KDERLKF KG PMTVLRLNNLMAS K I WTPDTF 
FHNGIOCSVAHNMTMPNKIiLRITEDGTLLYTMRLTVR\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSVWAED 
GS RLNQ YDLLGQT VDS GI VQ S S TGE YWMTTHFHL KRKI GYFV I 
QTYLPCIMWILSQVSFWLNRESVPARTVFGVTTVLTMTTLSIS 
ARNSL P KVAYATAMDW FIAVCYAFVFS AL IEFATVNYFTKRG YA 
WDGKSWPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLA 
TIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 
G I FNLVYWATYLNREPQLKAPTPHQ 


6012 


351 


5013 


PAELFQSFAIWHKELYDWRLGPWNQCQPVISKSLEKPLECIKGE 
EGIQVREIACIQKDKDIPAEDirCEYFEPKPLLEQACLIPCQQD 
CIVSEFSAWSECSKTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE 
REKDRSKGVKDPEAREL I KKKRNRNRQNRQENKYWD IQ IG YQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCS KTCHDMVS PAGTRVRTRTIRQ FP IGS EKECPE FEEKEPCLS 
QGDGWPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTALCGGG 
IQTREVYCVQANENLLSQLSTHKNKEASKPMDLKLCTGPIPNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCNDQQGKKGFKLRKRRIT 
NE P TGGS GVTGNC PHLLEAI P CEE PAC YDW KAVRLGDCE PDNG K 
ECGPGTQVQEWCINS DGEEVDRQLCRDAI FP I PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARS I LAYAGEEGGIRCP 
NSSALQE VRS CNEH P CTVYHWQTGP WGQC I EDTS VS S FNTTTTW 
NGEASCSVGMO/H^IC^Vl^GQVGPKKCPESLRPETVRPCLL 



WO 01/53312 



PCT/US00/34263 



PCKKDCIVTPYSDWTSCPS\SCKEGDSSIRKQSRHRVIIQLPAN 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 
VQQDS P \GAQEGOGPGRQARAI TCRKQDGGQAGIHECLQ YAGP V 
PALTQACQI PCQDDCQLTS WS KFS S CNGD CGAVRTRKRTL VGKS 
KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGD I KECGQG YR YQAMAC YDQNGRLVETS RCNS HG Y I 
EEACIIPCPSDCKLSEWSNWSRCSKSCX3SGVKVRSKWLREKPYN 
GGR PC PKLDHVNQAQ VYEWP CHSDCNQ YLWVTEP WS I CKVTFV 
NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 
GRS CPNAVEKE P CNLNKNC YH YD YNVTDW S T CQLSE KAVCGNG I 
KTRMLDCWS DGKS VDLK YCEALGLE KNWQMNTS CMVE C P VNCQ 
LSDWSPWSECS QTCGLTGKM I RRRTVTQP FQGDGR P CPSLMDQ S 
KP C P VKPC YRWQ YGQ WS P CQ VQ EAQCGEGTRTRNI S CWS DGS A 
DD FS KWDB E FCAD I E L 1 1 DGNKNMVLE E S CS QPCPGDCYL KDW 
S S WS LCQLTCVNGED LGFGGIQVRS R P V 1 1 QELENQKLCP EQML 
ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 
SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 
CTL I P WVL P TMEDKRGDVKT S RAVH PTQ P S SNP AGRGRTW FLQ 
P FG PDGRLKTWVYGVAAGAFVLLI FI VSM I YLACKKPKKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC 
S I VDVEFLPVYHPSPEESRDPTLYANNVQRVMAQALGI PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
ARPESNDQPGRVCQAATAL 


6014 


2857 


613 


EAVAGGMEKSRMNLPKGPDTLCFDKDEFMKEDFDVDHFVSDCRK 
RVQLEELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLVGM 
DKALNQLS VP LGQLRE E VLS LRS S VS EG I RAVDERMS KQE DIRK 
KKMCVLRLIQVIRSVEKIEKILNSQSSKETSALEASSPLLTGQI 
LERIATEFNQLQFHACQSK\GMPLLDKVRPRIAGITAMLQQSLE 
GLLLEGLQTSDVDI IRHCLRTYAT I DKTRDAEALVGQVLVKP YI 
DEVIIEQFVESHPNGLQVMYNKLIjEFVPHHCRLLREVTGGAISS 
EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 

TISMDFVRRLERQCGSQASVKRLRAHPAYHSFNKKWNLPVYFQI 
RFREIAGSLEAALTDVLEDAPAESPYCLLASHRTWSSLRRCWSD 

EMFLPLLVHRLWRIjHSGRFWARYSVFV\N\ELSLRPISNESPKE 
IKKPLVTGSKEPSITQGNTEDQGSGPSETKPWSISRTQLVYW 
ADLDKLQEQLPELLEIIKPKLEMIGFKNFSSISAALEDSQSSFS 
ACVPSLSSKIIQDLSDSCFGFLKSALEVPRLYRRTNKEVPTTAS 
SYVDSALKPIiFQLQSGHKDKLKQAIIQQWLEGTLSESTHKYYET 
VSDVLNSVKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKIRL 
QLALDVEYLGEQIQKLGLQASDIKSFSALAELVAAAKDQATAEQ 
P 


6015 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN 
RITVWRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLIiY 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT 
HKKMPH I NDCRRGC YFVLD WLQ KT YWCRQLENS LRETWEL EE FR 
EGIEEEDQEEDKNIVVDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEE VDS HCKKALS HKEL YERARE LL VS YEE EQFTVLEKFR YL P 
KAIKAWNNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
SAGQWEARRGWRLFNCSAS LDW PRMVES CLGS PCWAS PQLIjRI I 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASS S FGSEAKAQQQEEQGS VNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAWQVSS3DVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVI FWTKPVL \ EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF 


6016 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLPCDDHKLQRYALN 
R I TVWR S RSGNEL P LAVAS TADL I RCKLLD VTGGLGTDELRLLY 
GMALVRFVNLI SERKTKFAKVPLKCLAQE VNI PDW I VDLRHELT 
HKKMPHINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 
EG I EEEDQEEDKN I VVDD I TEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAI KAWNNPS PRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
F E QLAALQI E YEENVDLNDVLVPKPFS QFWQPLLRGLHSQNFTQ 
ALLERMLS ELPALGI SGI RPTYI LRWTVEL I VANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQIiLRII 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
E KEVLPDQ VE E EEENDDQEE EE ED EDDEDDE E EDRM E VG PFS TG 
QESPTAENARliLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\ S CG VGS \ GNC SNSS S S NFRGAFLLEARGS LH \ GL \ KTG LQLF 


6017 


203 


3469 


SHQE IEQNSAMAPRKRGGRGI S F I FCCFRNNDHPE I T YRLRNDS 

NFALQTMEPALPMPPVEELDVMFSELVDELDLTDKHREAMFALP 

AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLNSMAARKSLLAL 

EKEEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLSCILNFLK 

TMDYETSESRIHTSLIGCI KALMNNSQGRAHVLAHSES INVIAQ 

SLSTENIKTKVAVLEILGAVCLVPGGHKKVLQAMLHYQKYASER 

TR FQ TL INDLD KS TGR YRDE VS LKTA I MS F I NAVLS QGAG VES L 

D FR LHLRYE \ FLMLG I H P VMDKLRKHENS TLDRHLD F FEMLRNE 

DELEFAKRFELVHIDTKSATQMFELTRKRLTHSEAYPHFMSILH 

HCLQMPYKRSGNTVQYWLLLDRIIQQIVIQNDKGQDPDSTPLEN 

FNI KNWRMLVNENEVKQWKEQABKMRKEHNELQQKLE KKE REC 

DAKTQEKEEMMQTLNKMKEKLEKETTEHKQVKQQVADLTAQLHE 

LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM 

LPPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ 

PTNALKSFNWSKLPENKLEGTVWTEIDDTKVFKIIiDLEDLERTF 

S AYQRQQDFF VNSNS KQ KEADAI DDTLS S KLKVKELS V I DGRRA 

QNCNILLSRLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFVPE 

KSDIDLLEEHKHELDRMAKADRFIiFEMSRINHYQQRLQSLYFKK 

KFAER VAE VKP KVEAIRSQSEEVFRS GALKQLLE VVLAFGN YMN 

KGQRGNAYGFKISSLNKIADTKSSIDKNITLLHYLITIVENKYP 

SVLNLNEELRDI PQAAKVNMTELDKE ISTLRSGLKAVETELEYQ 

KSQPPQPGDKFVSWSQFITVASFSFSDVEDLLAEAKDLFTKAV 

KHFGEEAGKIQPDE FFG I FDQFLQAVSEAKQENENMRKKKEEEE 

RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 

FDKDLS KLKRNRKRITNQMTDS SRERP I TKLNF 


6018 
6019 


13 
2 


2510 
1066 


T I SQSGG I RRRREAVWFE WNMDFSRLHMYSPPQCVPENTG YTY 
AliSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD 
GEAVGADSGTSSAVSLKNRAARTTKQRRSTNKSAFSINHVSRQV 
TS SGVS YGGTVS LQDAVTRR P P VLDES WI REQTTVDHFWGLDDD 
GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNCNMLSERKDVLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
ACAG YFLLQ I LRR I GAVGQAVS RTAW S ALWLA WA PG KAASG VF 
WWLGIGWYQFVTIilSWLNVFLLTRCLRNICKFLVLLIPLFLLLG 
LSLRGQG\NFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 
PLQGD S EAFP WH WMSG VEQQVAS L SGQ CHHHGENLRE LTTLLQK 
LQARVDQMEGGAAG PS AS VRDAVGQP P RETDFMAFHQEHE VRMS 
HLEDILGKLREKSEAIQKELEQTKQKTISAVGEQLLPTVEHLQL 
ELDQLKSELS SWRJHWCTGCETVDAVQERVD VQVREMVKLLFS ED 
QQGGSLEQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 
TKQLPTS EAWS AVSEAGASG I TEAQARAI VNS ALKLYSQDKTG 
MVDFALESGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 
WIQPDIYPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHIPKTL 
SPTGNISSAPKDFAVYGLENEYQEEGQLLGQFTYDQDGESLQMF 
QALKRPDDTAFQ I VELR I FSNWGHPE YT CL YRFRVHGE P VX 
TPNDREPP PQRPPSSRRASHLAQE I TSAASLGDQTQ I LGSLTTA 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PVITSAIRSMPGISSQILTNAQGQVIGTLPWWNSASVAAPAPA
QSLQVQAVTPQLLLNAQGQVIATLASSPLPPPVAVRK\PSTPES
LLKSEVQPIKPTPTVPQPAWIASPAPAAKPSASAPIPITCSET
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFKIRRLSLGLTQT
QVGQALTATEGPAYSQSAICRFEKLDITPKSAQKLKPVLEKWLN
EAELRNQEGQQNLMEFVGGEPSKKRKRRTSFTPQAIEALNAYFE
KNPLPTGQEITEIAKELNYDREWRVWFCNRRQTLKNTSKLNVF
QIP 
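
The legend in the table header defines the notation used in each amino acid segment: one-letter residue codes, X for an unknown residue, * for a stop codon, / for a possible nucleotide deletion and \ for a possible nucleotide insertion. A minimal sketch of how a segment written in this notation might be read programmatically is given below. The function name `parse_segment`, the marker labels, and the choice to treat whitespace as page layout only are assumptions made for illustration; they are not part of the specification.

```python
import re

# One-letter codes listed in the table legend (X = unknown residue).
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")


def parse_segment(raw):
    """Split an annotated segment into residues and marker positions.

    Marker positions are 0-based indices into the cleaned residue string,
    that is, the number of residues that precede the marker.
    """
    # Assumption: whitespace inside a segment is layout only and carries no meaning.
    text = re.sub(r"\s+", "", raw)
    residues = []
    markers = {"insertion": [], "deletion": [], "stop": []}
    unrecognized = []
    for ch in text:
        if ch in RESIDUES:
            residues.append(ch)
        elif ch == "\\":   # possible nucleotide insertion
            markers["insertion"].append(len(residues))
        elif ch == "/":    # possible nucleotide deletion
            markers["deletion"].append(len(residues))
        elif ch == "*":    # stop codon
            markers["stop"].append(len(residues))
        else:
            # Anything outside the legend is reported, not silently dropped.
            unrecognized.append(ch)
    return "".join(residues), markers, unrecognized
```

Characters that fall outside the legend are returned separately so that a reader can decide how to handle them instead of having them discarded.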


6020 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 

AHTKP WT L TS Y WED I S HRLDAVNTLLAMAE RLQTN I E ALKS G I 

QG KI PANQ LAE L WLKL I DEVI E DTRYTLP LTEG KANVT VLDTQ I 

RKLRSRSLSQI HEAAVRMRS EATDVKSTLAE I EDWLDKLMQLTE 

EPQNSMPD 1 1 1 WMI RGE KRLAYARI PAHQVLYSTSGENASGKYC 

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 

GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 

FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 

GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 

AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 

LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 

RWRRKMAP S ETHGAAAI FKLEGALGADTTEDGDEKS LEKQKHSA 

TT VFGANT P I VS CNFDRD Y I YHLRC YVYQARNLLALDKD S FSDP 

YAHICFLHRSKTTEI IHSTLNPTWDQTI IFDEVEIYGEPQTVLQ 

NPPKVIMELFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW 

HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP 

QGIRPWQLTAI EILAWGLRNMKNFQMAS ITSPSLWECGGERV 

ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 

FGRKPVVGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 

DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV 

DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQFREL PDS VPQE CTVR I Y I VRGLELQ P QDNNGLCDPY I K I TLG 

KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD

TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE

FEANKI LHQHLGAP EER LALH I LRTQGLVPEHVETRTLHS TFQ P 

NIS\RYYLRVI I WNTKDVILDEKS ITGEEMSDI YVKGWIPGNEE 

NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 

S I DQTE FR I P PR \ LI IQ I W \ DNDKFS \ LDDYLG FPRTLT CRHT I 

HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 

PCYAE KDG AR VMAG KVEMTLE I LNE KEADERPAGKGRDEPNMNP 

KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLIiFLLILL 

LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW
AHTKPVVTLTSYWEDISHRLDAVNTLIAMAERLQTNIEALKSGI
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE
EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ
NPPKVIMELFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV
ESVVIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ
FGRKPVVGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR
DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE
FEANKILHQHLGAPEERIALHILRTQGLVPEHVETRTLHSTFQP
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6022 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW
AHTKPVVTLTSYWEDISHRLDAVNTLIAMAERLQTNIEALKSGI
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE
EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ
NPPKVIMELFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV
ESVVIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ
FGRKPVVGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR
DIVIEMEDTKPLIASKCLSSMSTALSKMASPATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQYIQXGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRBLPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITIiG 
KKV I E \ DRDH Y I PNTLNP VFGRM YELS CYLPQEKDLKI S VYD YD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEER1ALHILRTQGLVPEHVETRTLHSTFQP 
N I S \RY YLRVI I WNTKDV I LDEKS I TGE EMS D I YVKGW I PGKEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
S I DQTE FRI PPR \ L 1 1 Q I W \DNDKFS \LDD YLGF P RTLTCRHT I 
HFLQKS PGGNC/RGLDMIPDLKAMNPLKAKTAS LFEQKSMKGWW 
P C YAE KDGAR VMAG KVE MTLE I LNEKEAD ERP AGKGRDE PNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916 


S QELGMF VELNNLLNTTPDRAEQGKLTLL CDAKTDGS FLVHHFL 
S FYLKANCKVCFVALIQSFSHYS IVGQKLGVSLTMARERGQLVF 
LEGL / 1 VC SGR\ VFQAQ KE PHPLQFLRE ANAGNLKPL FE F VREA 
LKP VDSGEARWTYP VLLVDDLS VLLS LGMGAVAVLDF IHYCRAT 
VCWELKGNMWLVHDSGDAEDEENDI LLNGLSHQSHL I LRAEGL 
ATGFCRDVHGQLRI LWRRPSQPAVHRDQS FTYQYKIQDKS VSFF 
AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQLNELELLMEKSFWEEAE 
LPAELFQKKWASFPRTVLSTGMDNRYLVLAVNTVQNKEGNCEK 
RLVITASQSLENKELCILRNDWCSVPVEPGDIIHIiEGDCTSDTW 
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA 
TRQMLIGTVLHEVFQKAINNSFAPEKLQELAFQTIQEIRHLKEM 
YRLNLSQDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSL
PSDNS KDNS TCN I E WK PMD IEESIWSPR FGLKG KI D VTVGVK I 
HRG Y KTKYKI MPLELKTGKESNS I EHRS QVVLYTLLS QERRADP 
EAGLLL YLKTGQM YP VPANHLD KRELLKLRNQMAFS L FHR I S KS 
ATRQKTQLASLPQIIEEEKTCKYCSQIGNCALYSRAVEQQMDCS 
S VP I VMLP K I E EETQHLKQTHLE YFS LWCLMLTLES Q S KDNKKN 
HQN I WLMP AS EME KS GS C IGNL I RMEHVK I VCDGQ YLHNFQCKH 
GAIPVTNLMAGDRVIVSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLSVLPESTLFRIiDQEEKNCDIDTPLGNLSKLMENTFVSKKLR 
DL 1 1 DFREPQF IS YLS S VLPHDAKDT VAC I LKGLNKPQRQAMKK 
VLLS KDYTL I VGMPGTG KTTT I CTLVR I L YACGFS VLLTS YTHS 
AVDNILLKLAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 
KS\LALLEELYTSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ 
I S Q P I CLGPL FFS RRFVLVGDHQQLP PL VLNRE ARALGMS ES LF 
KRLEQNKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 
NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 
KVPAP EQVEKGGVSNVTEAKLI VFLTS I FVKAGCS PS D I G I IAP 
YRQQLKI INDLLARS I GMVE VNTVDKYQD\RDKS I VLVS FVRSN 
KDGTVGELLKDWRRLNVAITRAKHKLILLGCVPSLNCYPPLSKL 
LNHLNSEKLI I DL PS REHE S LCH I LGD FQRE 


6025 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA 

ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 

GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTELIP 

AACGATLPALGLRSSAQDPQAVLGALGRALS PLEEWLRLHT YLA 

GEAPTLADLAAVTALL L PFR Y VLD P PARR I WNNVTRW FVT GVRQ 

PEFRAVLGE WL YSGAR PL S HQ PG P EAPALP KTAAQLKKEAKKR 

EKLEKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 

PGE KKD VSGPMP DS YS PRYVE AAW YPWWEQQG F FKPE YGRPNVS 

AANPRGVFMMCIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGE 

TTLWNPGCDHAGIATQVWEKKLWREQGLSRHQLGREAFLQEVW 

KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 

LHEEG 1 1 YRS TRLVNWS CTLNSA I S D I E VDKKELTGRTLLS VPG 

YKEKVE FGVLVS FAYKVQGSDSDEE VWATTR I ETMLGDVAVAV 

HPKDTRYQHLKG KNVI HPFLS RS L P I VFDE FVDMD FGTGAVKI T 

PAHDQNDYEVGQRHGLEAISIMDSRGALINVPPPFLGLPRFEAR 

KAVLVALKERGL FRG I EDNPMWPLCNRSKDWE PLLRPQ W YVR 

CGEMAQAASAAVTRGDLRILPERHQRTWHAWMDNIRE\WCMFPG 

KLWWG\HR\IPAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 

KAAKEFGVSPDKISLQQDEDVLDTWFSSGLFPLSILGWPNQSED 

LSVFYPGTLI^TGHDILFFWVARMVMLGLKLTGRLPFREVYLHA 

IVRDAHGRKMSKSLGNVIDPLDVIYGISLQGLHNQLLNSNLDPS 

EVEKAKEGQKADFPAGIPECGTDALRFGLCAYMSQGRDINLDVN 

R I LGYRHFCNKLWNATKFALRGLGKGF VPS PTSQPGGHESLVDR 

W IRSRLTEAVRLSNQG FQA YD FPAVTTAQ YS FWL YELCD VYLEC 

LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 

RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALEIiALSITRA 

VRP\LRADYNLHPESGPTCFLEVAD\EATGALASAVSGYVQGPG 

QAQVWAVAEPWGLPAP \QGCAVAIiASDRCS I \HLQLQG\LLDP 

ARE LG \ KLQ \ AKRVEAQ \ RQAQ \ RLR \ERRA\ ASGNP VKVPL\ E 

VQEADEAKLQQTEAELRKVDEAIALFQKML 


6026


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC
TCE I RPWFT P RS I YNEAS rVDCNDLGLLTFPARLPANTQ I LLLQ 
TNNIAKIEYSTDFPVNLTGLDLSQNNLSSVTNINGKKMPQLLSV 
YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHN 
LLRLHLNSNRLQM INS KWFDALPNLEILM IGENPI IR I KDMNFK 
PL INLRSLVIAG INLTE I PDNAL VGLENLE S I SFYDNRL I KVPH 
VALQKWNLKFLDLNKNP INR IRRGDFSNMLHLKELGINNMPEL 
I S IDSLAVDNLPDLRKIEATNNPRLS YIHPNAFFRLPKLESLML 
NSNALSALYHGTIESLPNLKEI S IHSNP I RCDCVIRWMNMNKTN 
IRFMEPDSLFCVDPPEFQGQNVRQVHFRDMMEICLPLIAPESFP 
SNLNVEAGSYVSFHCRATA\EPQPEIYWITPSGQKLLPNT\LTD 
KFYVHSEGTLDINGVTPKEGGLYTCIATNLVGADLKSVMIKVDG 
SFPQDNNGSLNIKIRDIQANSVLVSWKASSKILKSSVKWTAFVK 
TENSHAAQSARIPSDVKVYNLTHLNPSTEYKICIDIPTIYQKNR 
KKC VNVTTKGLH PDQKE YE KNNTTTLMACLGGLLG I IGVI CL I S 
CLSPEMNCDGGHSYVRNYLQKPTFALGELYPPLINLWEAGKEKS 
TSLKVKATVIGLPTNMS 


6027 


5254 


4148 


GGRRAPGRPGRS I KDEEEET V FREWS FS PDPLPVR YYDKDTTK 
PISFYLSSLEELLAWKPRLEDGFNVALEPIACRQPPLSSQRPRT 
LLCHDMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TI PPVG WTNTAHRHGVCVLGTFI TEWNEGGRLCEAFLAGDERS Y 
QAVADRLVQIT\RFFRFDGWLINIENSLSLAAVGNMPPFLRYLT 
TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 
FTN YNWREEHLE RMLGQAGERRADVYVG VD VFARGNWGGR FDT 
DKVGGG FRPRASG P VP PLG PHF LMDL P F P S APQRNDS S CS S QS G 
DPVALRNRCPAPAKLCPH 


6028 


120 


3432 


NCLLLQAKGFHGEIEDIjQQWLTDTERHLLASKPLGGLPETAKEQ 
LNVHME VCAAFEAKEETYKS LMQ KGQQ M LARCPKSAETN I DQD I 
NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
WLTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQII 
ELDKTGTHLKY FS Q KQDWL I KNLL I S VQS R WEKWQRL VERGR 
SLDDARKRAKQFHEAWSKLMEWLEESEKSLDSELEIANDPDKIK 
TQLAQHKE FQKS LG AKHS VYDTTNRTGRS L KE KTS LADDNLKLD 
DMLSELRDKWDTICGKSVERQNKLEEA\LLFSGQFTDALQALID 
WL YRVE PQLAEDQ P VHGD I DLVMNL IDNHKAFQKELGKRTSSVQ 
ALKRS ARE L I EGSRDDS S WVKVQMQELSTR WETVCALS I S KQTR 
LEAALRQAEEFHSWHALLEWLAEAEQTLRFHGVLPDDEDALRT 
LI DQHKE FMKKLEE KRAE LNKATTMGDTVLAI CHPDS I TT I KHW 
ITI IRARFEEVLAWAKQHQQRLASALAGLIAKQELLEALLAWLQ 
WAETTLTDKDKEVIPQEIEEVKALIAEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHI P VLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNLLVSKWQQVWIiLALERRRKLNDALDRLBELREFA 
NFDFD I WRKK YMRWMNHKKS RVMDF FR R T DTfnnnnTf T td ot? l? t n 
G I LS S KFPTS RLEMSAVAD I FDRDGDG Y ID YYE FVAALH PNKDA 
YKP I TDADK I EDE VTRQ VAKCKCAKRFQVEQ IGDNKYRFFLGNQ 
FGDS QQ LRLVRI LRSTVMVRVGG G WMALDEFLVKNDPCRAKGRT 
NMELREKFILADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAAS PQVPATTTPKI LHPLTRNYGKPWLTNS KMSTPCKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 
IQS VCSDVETVPQTHRPTPRAGSRPSTAKPS KI PTPQRKS PAS K 
LDKSSKR 


6029


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AGI S QNAKTGDL PAFGECVG I AS KALCG LTEAAAQAA YLVGI FD 
PNSQAGHQGLVDP I QFARANQA I QMACQNLVDPGS S P S QVLS AA 
T I VAKHTS ALCNACR IAS S KTANPVAKRHF VQS AKE VANSTANL 
VKTIKALDGDFSEDNRNKCRIATAPLIEAVENLTAFASNPEFVS 
I PAQ I S S EG SQAQ E P I LVS AKPMLES S S YLIRTAR S LAI NP KDP 
P TWS VLAGHSHT VSDS I KS L I TS IRD KAPGQRECDYS I DG INRC 
I RD I EQAS LAAVS QS LATRDD I S VEALQEQLTS WQE I GHL I D P 
I ATAARGEAAQLGHKGTQLAS Y FEPL I LAAVGVAS KI LDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAI TEAAQLMK 
EAVDDIMVTLNEAASEVGLVGGMVDAIAEAMSKIjDEGTPPEPKG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGGLASQMTSD 
YGHLAFQGQMAAATAEPEE IGFQIRTRVQDLGHGCI FLVQKAG\ 
AIiQVCP TDS YTKR EL I E CARAVT E KVS L VLSALQAGNKGTQAC I 
TAATAVSGI I ADLDTT I MFATAGTLNAHNSETFADHRENI LKTA 
KALVEDTKLLVSGAASTPDKLAQAAQSSAATITQLAEWKLGAA 
SLGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKWVTNVTS LLICIVKAVEDEATRGTRALBAT I E C I KQ 
ELTVFQSKDVPEKTSS PEES I RMTKGITMATAKAVAAGNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TLG YLDLLEHVLVI LQKPTPELKQQLAAFS KRVAGAVTELIQAA 
EAMKGTEWVDPEDPTVIAETBLLGAAAS IEAAAKKLEQLKPRAK 
PKQADE TLDFEEQ I LEAAKS I AAATSAL VKS AS AAQRELVAQG K 
VG S I P ANAADDGQWSQGL I S AARMVAAATS S LCEAANAS VQGHA 
S E E KL I S S AKQ VAAS TAQLLVAC KVKADQDS E AMRRLQAAGNAV 
KRAS DNL VRAAQKAAFG KADDDDWVKTKFVGGI AQ I IAAQEBM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6030 


3 


1777 


F PGRGSP ALQIjE VL I CLG LMGLERALNVLAP I F YRN I VNLLTEN 
APWNSLAWTVTSYVFLKFLQGGGTGSTGFVSNLRTFLWIRVQQF 
TS RRVELLI FSHIiHELSLRWHLGRRTGE VLRI ADRGTSS VTGLL 
SYLVFNVIPTLADIIIGIIYFSMFFNAWFGLIVFLCMSLYLTLT 
IWTEWRTKFRRAMNTQENATRARAVDSLLNFETVKYYNAESYE 
VERYREAI I KYQGLEWKSSASLVLLNQTQNLVIGLGLLAGSLLC 
AYFVTEQKLQ VGDYVLFGT YI I QL YM PLNWFGTYYRM I QTNFID 
MENMFDLLKK\ETEVKDLPGAGPFRFQKGRIEFENVHFSYADGR 
ETLQD VS FT VMPGQTLAL VGPS GAG KS TI LRLL FRF YD I S S GC I 
R I DGQDISQVTQALFRFSHWELCPKDTVLFNDT I ADNI RYGRVT 
AGNDEVEAAAQAAGIHDAIMAFPEGYRTQVGERGLKLSGGEKQR 
VAI ARTILKAPG I ILLDEATSALDTSNERAIQASLAKVCANRTT 
I WAHRLS TWNADQ I L V I KDGCI VERGRHEALLSRGGVYADMW 
QLQQGQEETS EDTKPQTMER 


6031 


160 


1694 


LRMS ENLDKSNVNEAGKS KS NDSEEGLEDAVEGADEALQKAI KS 
DSS S PQRVQRPHSSP PRFVTVEELLETARGVTNMALAHE I WNG 
DFQ I KP VELPENS LKKRVKE I VHKAFWDCLS VQLSEDP PAYDHA 
I KLVGEIKETLLS FLLPGHTRLRNQI TE VLDLDL I KQEAENGAL 
DI S KLAEFI IGMMGTLCAPARDEE VKKLKDI KEI VPLFRE I FS V 
LDLMKVDMANFAISSIRPHLMQQSVEYERKKFQEILERQPNSLD 
FVTQWLEEASEDLMTQKYKHAIjPVGGMAAGSGDMPRLSPVAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLTILGAV 
LLVTFSMAAPGISSQADFAEKLKMIVKILLTDMHLPSFHLKDVL 
TTIGEKVCLEVSSCLSLCGSSPFTTDKETVLKGQIQAVASPDDP 
IRRIMESRILTFLETYLASGHQKPLPTVPGGLSPVQRELEEVAI 
KFARLVNYNKMVFCPYYDAILSKILVRS 


6032 


39 


2415 


AARLCRAQPT KS AWM I RDL S KM YPQTRH P APHQ PAQ P FKFT I S E 
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLN I EMHKQAE I VKRLNAI CAQV I P FLS QEHQQQ WQ AVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
P IGS S AGLLALS SALGGQS HLPI KDEKKHHDNDHQRDRDS I KS S 
SVSPSASFRGAEKHRNSADYSSESKKQKTEEKEIAARYDSDGEK 
SDDNL WDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
MNGELTS PGAAYAGLHNIS PQMSAAAAAAAAAAAYGRS PWGFD 
PHHHMRVP AI P PNLTG I PGGKP AYS FHVS ADGQMQ P VP FPPDAL 
IGPG I PRHARQ INTLNHGE VVCAVTI SNPTRHVYTGGKGCVKVW 
DI SH PGNKS PVS QLDCLNRDN Y I RS CRLLPDGRTL I VGGEAS TL 
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI 
AVWDLHNQTLVRQ FQGHTDGASC I D I SNDGTKLWTGGLDNTVRS 
W\ DLREGRQLQQHD / FFTS PVFSLGYCP \TEE WLAVGMENSN\ V 
EVLHWKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWM IRDLS KMYPQTRHPAPHQPAQPFKFT I S E 
S CDR I KEE FQFLQAQ YHS LKLECEKLAS E KTE MQRH YVM Y Y EMS 
YG LN I EMHKQAE I VKRLNAI CAQVI PFL SQEHQQQ WQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
P I GSSAGLLALSSALGGQS HLPI KDEKKHHDNDHQRDRDS I KSS 
S VS PSAS FRGAEKHRNSADYS S ES KKQKTEEKE I AAR YDS DGE K 
SDDNL WDVSNEDPSS PRGSPAHS PRENGLDKTRLLKKDAPI S P 
AS IASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPQKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
MNGELTSPGAAYAGLHNISPQMSAAAAAAAAAAAYGRSPWGFD
PHHHMRVPAIPPNLTGIPGGKPAYSPHVSADGQMQPVPFPPDAL
IGPGIPRHARQINTLNHGEVVCAVTISNPTRHVYTGGKGCVKVW
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


ESGRRRRLKRRRSPCPGTAGGPGETNPGPGACPRGPREEAAAAM 
E I APQEAPPVPGADGD IEEAPAEAGS PS PAS PPADGRLKAAAKR 
VTFPSDEDIVSGAVEPKDPWRHAQNVTVDEVIGAYKQACQKLNC 
RQIPKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLEQTNLDEDGASALFDM IEY YESATHLNI S FNKH IGT 
RGWQAAAHMMRKTSCLQYL\DARNTPLLDHSAPFVARALRIRSS 
LAVLHLENASLSGRP LMLLATALKMNMNLRELYL \ADNKLNGLQ 
DSAQLGNLLKFNCSLQ I LDLRNNHVLDSGLAYI CEGLKEQRKGL 
VTL\ VLWNNQLTHTGMAFLGMTL PHTQS LETLNLGHNP I GNEGV 
RHLKNGLISNRSVLRLGLASTKLTCEGAVAVAEFIAESPRLLRL 
DLRENE I KTGGLMALSLALKVNHS LLRLDLDRE PKKEAVKSF I E 
TQKALLAEIQNGCKRNLVLAREREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLLEASQESGQETL 


6035 


19 


404 


S VTYLGI I LHKNTGALPAD PVQLI SQTPTPSTKQQLLS FLGMVG 
YFYLWIPGFAILTKPLCKLTKENLADAIDPKSFSHSSFRSLKTA 
LENASTLALPDSSQPF\SLHTAEVQGCWEILTQGLGPLPV 


6036 


1745 


356 


LPDVEKLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
SRGGQGRG VEKP PHLAALILARGGSKGI PLKNI KHLAGVPLIGW 
VLRAALDSGAFQS VWVSTDHDEI ENVAKQFGAQVHRRS SE VSKD 
S STS LDAI I EFLNYHNE VD I VGNI QATS PCLHPTDLQKVAEMIR 
EEG YD S VFS WRRHQ F RWS E I QKG VREVTE PLNLNPAKR P RRQD 
WDGELYENGSFYFAKRHLIEMGYLQGGKMAYYEMRAEHSVDIDV 
D I DWP IAEQRVLR YGYFGKEKLKE I KLLVCNI DGCLTNGH I YVS 
GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVSVSDKLAVVDEWRKEMGLCWKEVAYLGNEVSDEECLK 
RVGLSGAPADACSTAQKAVGYICKCNGGRGA\ IREFAEHI C\LL 
MEKGLINFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTS W W M SS VLTI LLFSLQGN KMLN YS AP S AGG YLL PRKP VGT PA 
GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
S FS EGGERLLPTQKQPGGGQVNS S R YKT\EL CR P FEENG ACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 
HNAEERRALAGARDLS ADRPRLQHS FS FAGFPS AAATAAATGLL 
DSPTSITPPPILSADDLLGSPTLPDGTNNPF\AFSSQELASLFA 
PSMGLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSLSDQEGYLS 
SSSSSHSGSDSPTLDNSRRLPIFSRLSISDD 


6038


1450 


426 


SSALQEFGTRNHTFG VPLPHRRKQ IIS CNI CQLRFNS DSQAAAH 
YKG TKHAKKL KALEAMKNKQKS VTAKD S AKTTFTS I TTNT I NTS 
S DKTDG TAGTPA IS TTTTVEI RKSS VMTTE ITS KVEKSPTTATG 
NSSCPSTETEEEKAKRLL \ YCSLCKVAVNS ASQLEAHNSGTKHK 
TMLEARNGSGTI KAFPRAGVKGKGPVNKGNTGLQNKTFHCE I CD 
VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFS KEPS KPLAPRILPNPLAAAAAAAAVAVS S PFSLRTAP 
AATLFQTS ALP PALLR PAPG P I RTAHT P VL FAPY 


6039 


4073 


1000 


LDE YEARLTLANLDD FE EDNEDDDENRVNQEEKAAK I TEL INKL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
lUCTEDSFYNNSYNPFKEVQTPQYLNPFDEPEAFVTIKDSPPQST 
KKKN I RP VDMSK YL YADS S KTEEEELDESNP FYE P KS TPPPNNL 
VNP VQELE TERR VKRKAP AP P VLS PKTGVLNENTVSAGKDLSTS 
PKPSPIPSPVLGRKPNASQSLLVWCKEVTKNYRGVKITNFTTSW 
RNGLS FCA I LHHFR PDL I DYKS LN PQD I KE NNKKAYDG FAS IG I 
SRLLEPSDMVLLAIPDKL1VMTYLYQIRAHFSGQELNWQIEEN 
SSKSTYKVGNYETDTNSSVDQEKFYAELSDLKREPELQQPISGA 
VD FLS QDDS VFVND SG VGESE S EHQTPDDHLS PSTAS P YCRRTK 
SDTEPQKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKKRLLKA 
ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENS RS LE CRS D PE S P I KKTS LS PTS KLG Y S YS RDLDLAKKKHA 
SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

^j-ir>.c^ijnt\uruvjis nxiVi Lnlnnlr c UNKyijaljyUDEERRKQLRER 

ARQL I AEARS GGKMSELPS YGERAAEKLKERS KASGDENDNIE I 

DTNE E I PEG FWGGGDE LTNLENDLDTPEQNS KLVDLKL KKLLE 

VQPQVANSPSSAAQKAVTESSEQDMKSGTEDIiRTERLQKTTERF 

RNP WFSKDS TVRKTQLQS PSQ Y I ENRPEMKRQRS IQEDTKKGN 

EEKAATTKTOPK"3 c l'PnP VT .KTWr3T?wno V cnvin^ur a at mTi-irtwo 
uutuuiA x u j. ojjud v i_u\ xv\ji7 i\jL/o \ayi V VtjCiLiAAIjENEQKQ 

IDTRAALVEKRLRYLMDTGRNTEEEEAMMQEWFMLVNKKNALIR 
RMNQLS LLE KEHD LERR YELLNR E LRAMLA I E D WQKTEAQKRRE 

QLLLDELVALVlJKRDALVRDLDAQEKQAEEEDEHLERTLEQNiCG 
KMAKKE E KCVLQ 


6040 


475 


1052 


PTALMTAPSCAFPVQFRQPSVSGLSQITKSLYISNGVAANNKLM 

LSSNQITMVINVSVEWNTLYEDIQYMQVPVADSPNSRLCDFFD
PIADHIHSVEMKQGRXTLLHCAAGVSRSAALCLAYLMKYHAMSL
LDAHTWTKSCRPIIRPNSGFWEQLIHYEFQLFGKNTVHMVSSPV
GMIPDIYEKEVRLMIPL


6041 


2 


3886 


TEKDEKTAHNLENVLIHFWERLSEICVAKISEPEADVESVLGVS 
NLLQVLQKPKGSLKSSKKKNGKVRFADEILESNKENEKCVSSEG 
E K I E CWELTTE PS LTHNS S GLLS PLR KKP LEDLVCKLAD I S IN Y 
VNERKSEQHLRFLSTLLDSFSSSRVFKMLLGDEKQSIVQAKPLE 
IAKLVQKNPAVQFLYQKLIGWLNEDQRKDFGFLVDILYSALRCC 
DNDMERKKVLDDLTKVDLKWNSLLK 1 1 E KACPS S DKHAL VT P WL 
KGDILGEKLVNLADCLCNEDLESRVSSESHFSERWTLIiSLVLSQ 
HVKND YLI GDVYVER 1 1 VRLHETLFKTKKLSEAESSDSS VS FIC 
DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 
CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 
DINSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSEWEK 
MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHLCT 
S ALLS KM VL I ALRKET VLENNELE K I IAELL YS LQWCEELDNP P 
IFLIGFCEILQKMNITYDNLRVLGNMSGLLQLLFNRSREHGTLW 
S L I X AKL I LS RS I S S DE VKPHYKRKES FFP LTEGNLHT IQS LC P 
FLSKEEKKEFSAQCI PALLGWTKKDLCSTNGGFGHLAI FNSCLQ 
TKS IDDGELLHGILKII ISWKKEHEDIFLFSCNLSEASPEVLGV 
NIEIIRFLSLFLKYCSSPLAESEWDFIMCSMIiAWLETTSENQAL 
YSIPLVQLFACVSCDLACDLSAFFDSTTLDTIGNLPVNLISEWK 

EFFSQGIHSLLLPILVTVTGENKDVSETSFQNAMLKPMCETLTY 
ISKEOLLSHKLPARLVADDKTNT.'PPVT.nTT.T kttt zitdt t t nn^nn 

VQ I AVYHMLYKLMPE LP Q YDQDNL KS YGDEE EE PALS P P AALMS 
LLS IQEDLLENVLGCI P VGQIVTI KPLSEDFCYVLGYLLTWKLI 
LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPENPTYAE 
TAVEVPNKD P KTFFTEE LQLS I RE TTMLP YH I PHLACS VYHMTL 
KDL P AMVRLWWNS S E KR V FN I VDRFTS KYVS S VLS FQE I SS VQT 
STQL FNGMT VKARATTRE VMAT YT I ED I VI EL 1 1 QLP S N YPLGS 
I I VE S G KRVG VAVQQ WRNWMLQLS T YLTHQNGS I MEGLALWKNN 
VDKR FEGVEDCM I C FS V IHG FN YS L P KKACRTCKKKFHS A\CL Y 
KWFTSSNKSTCSLCRETFF 


6042 


1306 


253 


MAELAPASPSDIKASVSNGDTTLLCSRRQSCGMNEVRQVSLTYP 
GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
GAQRAPGGLSYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 
LVDKNIDRFIPITKLKYYFAVDTMYVGRKLGLLFFPYLHQDWEV 
Q YQQDT P VAPR FD VNAP DL Y I PAMAF I TYVLVAGLALGTQDRFS 
PDLLGLQASSALAWLTLEVLAIIJjSLYLVTVNTDLTTIDLVAFL 
GYKYVGM IGGVLMGLLFGKIGYYLVLGWCCVAI FVFMIRTLRLK 
I LADAAAEGVP VRGARNQIjRMYLTMAVAAAQPMLM YWLTFHLVR 


6043 


403 


599 


LCLFFPFPCATPVLPLPSLISAL/CLSHLSVSSWFCPCQPPLPC 
PLPPLQNKTAKGSLSTEQSERG 
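
Each entry in the table pairs a beginning and an end nucleotide location with an amino acid segment. The following sketch shows one way the two locations might be compared with the length of the segment. It is illustrative only: the function names `nucleotide_span` and `residue_count` are hypothetical, the locations are assumed to be 1-based and inclusive, and the label "reverse" merely records that the beginning location is numerically greater than the end location, which may or may not indicate the opposite strand.

```python
def nucleotide_span(begin, end):
    """Return (length, orientation) for a begin/end location pair.

    Assumption: locations are 1-based and inclusive.
    """
    length = abs(end - begin) + 1
    orientation = "forward" if begin <= end else "reverse"
    return length, orientation


def residue_count(segment):
    """Count one-letter residue codes, ignoring spaces and the /, \\ and * markers."""
    return sum(1 for ch in segment if ch.isalpha())


# Values copied from the entry for SEQ ID NO: 6043 in the table.
segment_6043 = (
    "LCLFFPFPCATPVLPLPSLISAL/CLSHLSVSSWFCPCQPPLPC"
    "PLPPLQNKTAKGSLSTEQSERG"
)
span_6043 = nucleotide_span(403, 599)
count_6043 = residue_count(segment_6043)
```

Comparing the span with three times the residue count gives a rough consistency check on an entry; small differences are to be expected where a segment carries deletion or insertion markers.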


6044 


793 


412 


KLEMWNFTLISKVKISREVTMIASKFGIGQQVRHSLLGYLGVW 
D IDP VYSLSE PS PDELAVNDELRAAPWYHWMEDDNGLPVHT YL 
AEAQLSSEU}DEHP\EQPSMDELAQTIRKQLQAPRLRN 


6045 


155 


2299 


SPLPQVAAMNYLRRRLSDSNFMANLPNGYMTDLQRPQPPPPPPG 
AHS PGAT PGPGTATAERS SG VAPAAS PAAP S PGS S GGGG FFS S L 
SNAVKQTTAAAAATFS E QVGG G SGGAGRGGAAS R VLL VI DEPHT 
DWAK YFKG KK IHGE I D I KVEQAEFS DLNL VAHANGG FS VDME VL 
RNGVKVVRSLKPDFVLIRQHAFSMARNGDYRSLVIGLQYAGIPS 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
EMLSS\TTYPVWKMGHGTLWGWGKVKVDNQHDFQDIASVVAIjT 
kt yatae pf i dakyd vr vqki gqn ykaymrts vsgnwktntgsa 
MLEQ I AMSDR YKLWVDTCS E I FGGLD I CAVEALHG KDGRDHI I E 
WGSSMPL1GDHQDEDKQLIVELWNKMAQALPRQRQRDASPGR 
GSHGQT P S PG ALPLGRQTS QQPAGPP AQQRP PPQGGP P Q PGPG P 
QRQGPPLQQRPPPQGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
ASQAAPPTQGQGRQSRPVAGGPGAPPAARPPASPSPQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AtjfvfK Lisep L lyvPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 
AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRWAGPESLPPLPR 
SL1 MDS PRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAP LS PGL P AMGGPG PG P CEDPAG AGGAGAGGS E PL VTVT VQCA 
FTVALRARRGADLSSLRALLGQALPHQ\AQLGQLSYLAPGEDGH 
WVPIPEEE3LQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAQGPEDLGFRQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


P VLVTS LRMREADTLRP PQLMEVSAD 1 1 ST VE FNHTGE L LATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDEEGKLKDLSTVTSLQVPVLKPMDLMVEVS PRR I FANGHTYH 
INSISVNSDCETYMSADDLRINLWHLAITDRSFTP\NIVDIKPA 
NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
T VKVW DL \ NME AR P I ET YQVHDYLRS KLCSL YEND CI FDKFE CA 
WNGSDSVIMTGA\YNNFFRMFDRNTKRDVTL\EASRESSKPRAV 
LKPRRVCVGGKRRRDD IS VDS LDFTKKI LHTAWHPAENI IAIAA 
TNNLY I FQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 
KGTSNSS KTRAGANSKGRRGSQNSSEHRPPAS STS EDVKAS PSS 
ANKRKNKPLS DMELNS S S EDS KGS KRVRTNSMGSATGPL PGT KV 
EPTVLDRNCPS PVLIDCPHPNCNKKYKHINGLKYHQAHAHTDDD 
S KPEADGDS E YGEE P I LHADLGS CNG \ AS VSQK \ G S LS PAR SAT 
P KVRL VE PHSPSPSSKFS TKGLCKKKLSGEGDTDLGALSNDGS D 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 
SARPI/APLAIPPQQIYTFQTATFTAASPGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 
L E S PLTPGKVCRAEEGKS P FRES SGNGMKMRfTT .T ,NR Q DPHO Q P 
LAS I KAEADKI YS FTDNAP S PS IGGSSRLENTT PTQPLTPLHW 
TQNGAEASS VKTNS PAYSDI SDAGEDGEGKVDSVKS KDAEQLVK 
EGAKKTLFPPQPQSKDSPYYQGFES YYS PSYAQSS PGALNPS SQ 
AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 
I QQRPNM YMQ S L YYNQ YAYVP P YGYSDQS YHTHLL S TNTAYRQQ 
YEEQQKRQSLEQQQRGVDKKAEMGLKEREAALKEEWKQKPSIPP 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVIIPKLDDSSKLPGQ 
APEGLKVKLSDASHLSKEAS EAKTGAECGRQAEMDP ILWYRQBA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 
EDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY

IPYMHGYSYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSYSFSP 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 

SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 

SPSQRLMSTHHHHHHLGYSLLPAQYNLPYAAGLSSTAIVASQQG 
STPSLYPPPRR 


6049 


215 


1089 


AMTG VFDRR VPS I RS GDFQAP FQTS AAMHH P S QE S PTL PES SAT 
DSDYYSPTGGAPHGYCSPTSASYG\KALNPYQYQYHGVNGSAGS 
YPAKAYADYS YASS YHQYGGAYNRVPSATNQPEKEVTE PEVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 
GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSNQSPASSYLENSASWYTS 
AASS INSHLP P PGS LQHPLALAS GTL Y 


6050 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRS 1 1 FAN YIARDTRRLGATILD 
RIHSLQINSSLSTYS L VDS VGNTKT FDVEHSHVR FLGNL VLNLW 
DCGGQDTFMEN YFTSQRDNIFRNVE VL I YVFD VESRELEKDMHY 
YQSCLEAI LQNSPDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLS RP LECS CFRTS I WDETLYKAWS S I VYQL I PNVQQLEMNLRN 
FAEI IEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI IKQ 
FKLSCSKLAAS FQSMEVRNSNFAAF I D I FTSNTYVMWMSDPS I 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6051 




1718


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6052 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053 


201 


1704 


KGTEMNKSRWQS RRRHGRRS HQQN PW FRLRDSEDRSDS RAAQ PA ' 
HDSGHGDDES PS T S S GTAGTS S VPEL PG F YFDPEKKR YFRLL PG 
HNNCNPLTKES I RQKEMES KRLRLLOEEDRRK KT APMns pMn ccm 
LRjCSQLGFLNVTNYCHLAHELRLSCMERKKVQIRSMDPSALASD 
R FNL I LADTNSDRLFT VND VTVGGSKYG I INLQS L KT P TL KVFM 
HENLYFTNRKV\NSVCWASLNHLDSHILLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TGLSRRVLLTNWTGHRQSFGTNSDVLAQQFALMAPLLFNGCRS 
GEIFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD 
MAGKIKLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEG I LVAVG 
QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLMAVGQDLYCYSYS 


6054 


1 


1054 


PPIARLQEFGTSRRHMAAPSGVHLLVRRGSHRIFSSPLNHIYLH 
KQSS SQQRRNFFFRRQRDI SHS I VLPAAVS S AHP VPKH I KKPDY 
VTTGIVPDWGDSIEVKNEDQIQGLHQACQLARHVLLLAGKSLKV 
DMTTEE I DALVHRE 1 1 SHNAYPS PLGYGGF P KS VCTS VNNVL CH 
G I PDSRPLQDGDI INI DVTVYYNG YHGDTSETFLVGNVDECGKK 
L VE VARRCRDEAIAACRAGAP FS VI GNT I S H I THQNG FQVCPHF 
VGHGIGSYFHGHPEIWHHANDSDLPMEEGMAFTIEPIITEGSPE 
FKVLEDAWTWSLD/TSKVSAQFEHTVLITSRGAQILTKLPHEA 


6055 


421 


2364 


PPYFLLSFIAWWLYGQSDRTETDISQSAGPPPGTLQCSALHHDP * 

GCANCSRFCRDCSPPACQCHTHVFPGNALNGVQPPELSRTLAIjI 

SSREPPRKKKKSQTETGKERERTSFLTQGGKRFELQHGLAGICM 

TLLITGDSIVSAEAVWDHVTMANRELAFKAGDVIKVLDASNKDW 

WWGQ I DDE EGWFPAS FVRLWVNHEDEVEEGPSDVQNGHLDPNSD 

CLCLGRPLQNRDQMRANVINE IMSTERHYI KHLKDICEGYLKQC 

RKRRDMFSDEQLKVIFGNIEDIYRFQMGFVRDLEKQYNNDDPHL 

S E I GPCFLEHQDG FW I YS E YCNNHLDACMELS KLMKDS R YQHFF 

EACRLLQQM I D I A\ I DG FL LT P VQK I CK YPLQLAELLK YTAQDH 

SDYRYVAAALAVME?NVTQQINERKRRLENIDKIAQWQASVLDWE 

GEDILDRSSELIYTGEMAWIYQP\YGRNQQRVFFLFDHQMVXiCK 

KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKLH 

NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 

NQKRQAAMTVRKVPKQKGVNSARSVPPSYPPPQDPLNHGQYLVP 

\DGIAQSQVFEFTEPKRSQSPFWQNFSRLTPFKK 


6056 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLAYMESKGAHRAGIiAKVIPPKEWKPRQCYDDIDNLLIPAPIQQ 
MVTGQSGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ERKYWKNLTFVAP I YGAD INGS I YDEGVDEWNI ARLNTVLDWE 
EECGISI EG VNTP YLYFGM WKTTFAWHTEDMDLYS IN YLHFGEP 
KS W Y A I P PEHGKRLERLAQG F FP S S SQGCDAFLRHKMTL I S P S V 
LKKYGIPFDKITQEAGEFMITFPYGYHAGFNHGFNCAESTNFAT 
VR W I D YG KVAKLCTCRKDMVKI S MD I F VRKFQPDR YQL W KQGKD 
I YT I DHT KPT PAS TPEVKAW LQRRRKVRKAS RS FQCARS TS KRP 
KADEEEEVSDBVDGAEVPNPDSVTDDLKVSEKSEAAVKLRNTEA 
SSEEESSASRMQVEQNLSDHIKLSGNSCLSTSVTEDIKTEDDKA 
YAYRS VPS ISSEADDS I PIiSTGYEKPEKSDPSELS WPKS PES CS 
SVAESNGVLTEGEESDVESHGNGLEPGEI PAVPSGERNS FKVPS 
I AEGENKTS KS WRH PLSRP PARS P MTLVKQQAP S DEEL P E VLS I 
EEEVEETESWAKPLIHLWQTKPPNFAAEQEYNATVARMKPHCAI 
CTLLMPYHKPDSSNEENDARWETKLDEWTSEGKTKPLIPEMCF 
IYSEENIEYSPPNAFLEEDGTSLLISCAKCCVRVHASCYGIPSH 
EICDGWLCARCKRNAWTAECCLCNLRGGALKQTKNNKWAHVMCA 
VAVPE VRFTNVPERTQ IDVGR I PLQRLKLKCIFCRHRVKRVSGA 
CIQCSYGRCPASFHVTCAHAAGVL\MEPDDWPYWNITCFRHKV 
NPNVKS KACEKVI S VGQTVI TKHRNTR Y YS CR VMAVTSQT F YE V 
MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VSAGRCHLGTCQWSLSSPHVSQAQQETYLGFWINSKKSQCNIF 
LSGTY 


6057


1 


853 


FVARLKEQEGEGGLGPRKEKGRARGRERRRKMQIiTRCCFVFIiVQ 
GSLYLVICGQDDGPPGSEDPERDDHEGQPRPRVPRKRGHISPKS 
R PMANS TLLGLLAP PGEAWG I LGQ PPNRPNHS P P P S AKVKK I FG 
WGDFYSNI KTVALNLLVTGKI VDHGNGTFSVHFQHNATGQGN1 S 
I SLVPPS KAVEFHQEQQ I FI EAKASKI FNC\RMEWEKVE\RGRR 
TSLFTHDPAKICSRDHAQSSATWSCSQPFKWCVYIAFYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 


H PLPS AS LGliP S VS LG VS LC VRS ALLEAWP MLP KRRRARVGS P 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGIiARSKGFR 
VLD ACSS EATH WM E ETS AEEAVS WQER RMAAAP PGCTP PALLD 
I SWLTES LGAGQPVPVECRHRLEVAGPS KGPLS PAWMPAYACQR 
PTPLTHHNTGItSEALE ILAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQLQGLPHFGEHSSRWQELLEHGVCEEVERVRRSE/ 
RL FTQ I FGVG VKTADR W YREGLRTLDDLREQ P Q KLTQQQKAGE P 
S R E AGP WASLNCTLD P SAS TP 


6059 


2 


3650 


QQD FESLADLTDHRAHRC PGDGDDDPQLS W VAS S P S S KD VAS PT ™ 
QM IGDGCDLGLGEEEGGTG LP YPCQFCDKS F I RLS YLKRHEQIH 
S D KL P FKCT YCSRLFKHKRS RDRH I KLHTGDKKYHCHE CE AAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
EHIiAKSEKEAKkDDFMCDYCEDTFSQTEELEKHVLTRHPQLSEK 
ADLQCIHCPEVFVDENTLLAHIHQAHANQKHKCPMCPE\QFSSV " " 

\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 

BRGSTPDSTLKPLRGQKKMRDDGQGWTKWYSCPYCSKRDFNSL 

AVLE IHLKTIHADKPQQSHTCQI CLDSMPTLYNLNEHVRKLHKN 

HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP 

SDGNNAFFCNQCSMGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 

VQPTQS FME VYS CP YCTNS P I FGS I LKLTKH I KENHKN I PLAHS 

KKS KAEQ S P VS S D VE VS S P KRQRLS AS ANS I SNGEYP CNQ CDLK 

FSNFESFQTHLKLHIiELLLRKQACPQCKEDFDSQESLLQHLTVH 

YMTTSTHYVCESCDKQFSSVDD\LQKH\LLDMPHPLCCTHCT\L 

CQEVFDS \ KVS I \QVHLAVKHSNEKKMYRCTACNWDFRKEADLQ 

VHVKHSHLGNPAKAHKCIFCGETFSTEVELQCHITTHSKKYNCK 

FCS KAFHAI I LLE KHLREKHCVFDAATENGTANGVPPMATKKAE 

PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 

VLLQNHRLRDHNIRPGEDDGSRKKAEFIKGSHKCNVCSRTFFSE 

NGLREHLQTHRGPAKHYMCPICGERFPSLLTLTEHKVTHSKSLD 

TGTCRI CKMPLQSEEE F I EHCQMHPDLRNSLTGFRCWCMQTVT 

STLELKI HGT FHMQKLAGS S AAS S PNGQGLQ KL YKCAL CLKEFR 

SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGLAPPEPADRPCA 

GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 

PRKKTYQC I KCQMTFENE RE I Q IHVANHM I E EG I NHEC KL CNQM 

FDS P AKLLCHL I EHS FEGMGGTFKCP VCFTVFVQANKLQQH I FA 

VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


S YE I VGKNKL E VNHS QL KAL CKCS L PS RLL PLGENLPLLD RG FR 
KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
DISASRPNILLLMADDLGIGDIGCYGNNTMRTPNIDRLAEDGVK 
LTQH I S AASLCT P S RAAFLTGR Y P VRSGMVS S I G YRVLQWTGAS 
GGLPTNETTFAK I L E EKG YATGL I GKWHLGLNC E S ASDKCHHP L 
HHGFDHFYGMPFSLMGDCARWELS E KRVNLEQKLNFLFQVLALV 
ALTLVAGKLTHL I P VS WMP V I WS ALS AVUjLAS S Y FVGAL I VHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFV 
SFLHVHIPLITMENFLGKSLHGLYGDNVKEMDWMVGRILDTLDV 
EGLSNSTLIYFTSDHGGSLENQIjGNTQYGGWNGIYKGGKGMGGW 
EGGIRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTWRLAGSEVP 
QDRV I DG QDLLP LLLGTAQHSDHE FLMH YCERFLHAARWHQRDR 
GTMWKVH FVTP VFQ P EGAG AC YGR KVCP CFGEKWHHDPP LL F0 
LS RDPSETHILTPASEPVFYQVMER \ VQQAVWEHQRTLSPVPLQ 
LDRLGNI WRPWLQPCCGPFPLCWCLREDDPQ 


6061 


110 


1330 


MNIHMKRKTIKNINTFENRMLMLDGMPAVRVKTELLESEQGSPN 
VHNYPDMEAVPLLLNNVKGEPPEDSLSVDHFQTQTEPVDLSINK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLQ 
SNKLSHVHRIPVWQSVPVVYTAVRSPGNW^NTIVVPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTALSIA 
RAVQEVHPS PVSRVRGNRMNNQKFPCS I S PFSIESTRRQRTVLN 
P PDSRKTAYS TDCD F \ EGLQQKL YTKS S S PGRVHRRTHTGE KP Y 
KCTWEGCTWKFARS D E LTRHYRKHTGVKP FKCADCDRS FS RSDH 
LALHRRRHMLV 


6062 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT 
LIVLFWGSKMFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR 
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGPEWCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP 
YPYCYQGGRVICRVIMPCWWWVARMLGRV 


6064 


913 


311 


NLPQSLPRPTEHS P P YS LEKMTDLVAVWDVALSDGVHK I EFEHG 
TTSGKRWYVDGKEEIRKEWMFKLVGKETFYVGAAKTKATINID 
AISGFAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LE KDAMD VWCNGKKLETAGE F VDDGTETHFS I GTH \ ACY I KAV\ 
SSG\KRKEGIIHTLIVDNREIPE1*AS 


6065 


1153 


641 


MS VRVARVAW VRGLGAS YRRGAS S FPVP P PGAQGVAE LLRDATG 
AEEEAPWAATERRMPGQCSVLLFPGQGSQWGMGRGLLNYPRVR 
ELYAAARRVLGYDIiLELSLHGPQETLDRTVHCQPAIFVASLAAV 
EKLHHLQPSVI ENCVAAAGFS VGEFAALVFAGAMEFAEG 


6066 


68 


3470 


VKENMP ATRKPMR YGHTEGHTE VC FDDS G S F I VTCX3S DGD VR I W 
EDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDG I LTR FTTNANHWFNGDGT K I AAGS S D \ FLVKI VDVMDS S 
QQ KT FRGHDAP VLS L S FDP KD I F LAS AS CDG S VR VWQ I S DQTCA 
I S W P LLQKCND VE NAKS I CRLAWQ P KS G KLLA I P VEKS VKL YRR 
ES WSHQFDLSDNFISQTLNI VTWS PCGQYLAAGS INGL 1 1 VWNV 
ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 
CDPSGKTSSSKVSSRVEKDYNDLFDGDDMSNAGDFLNDNAVEIP 
SFSKGIINDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLKT 
GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRFMVWNS IGIIRCYNDEQDNAIDVEFHDTSIHHAT 
HLSNTLNYT I ADLSHEAI LLACESTDELAS KLHCLHFSS WD S S K 
EWI I DLPQNED I EAICLGQGWAAAATS ALLLRLFTIGGVQKEVF 
SLAGPWSMAGHGEQLFIVYHRGTGFDGDQCLGVQLLELGKKKK 
QILHGDPLPLTRKSYLAWIGFSAEGTPCYVDSEGIVRMLNRGLG 
NTWTP I CNTREHC KG KS DH YW WG I HENPQQLR CIPCKGSRFPP 
TLPRPAVAILSFKLPYCOIATEKGnMFPOT?iORC!VT'5 , mjwT hvt a 

KNGYEYEESTKNQATKEQQELLMKMLALSCKLEREFRCVELADL 
MTQNAVNLAI KYAS RSRKL I LAQKLSELAVEKAAELTATQVEEE 
EEEEDFRKKLNAGYSOTATEWSQPRFRNQVEEDAEDSGEADDEE 
K PE IHKPGQNS FS KS TNS S D VS AKS GAVT FSSQGR VNP FKVS AS 
S KEPAMSMNSARSTNI LDNMGKSSKKS'PATiSRTTNNFK'q P T t if d 
LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPQNTENQRPKTGFQMWLEENRSNILSDNPDFSDEADIIKEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


6067 


858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKTALLQDGRRKVHYLF 
PDGKE MAE E YDE KTS ELLVRKWRVKSALGAMGQWQLEVGDPAPL 
GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 
SVDQKERCI IVRTTNKKYYKKFS I PDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNLSDSGEEP 
RGEAEAPHHGTGHPESAGEHALEPPAPAGASASTPPPPAPEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLOAPOP KALSOT VPS SGTNGVQ T .pa nPTfi &\ma » q nn<r asm 

RSPSEAADEVCALEEKEPQKNESSNASEEEACEKKDPATQQAFV 
FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 
S LENS TNS ADAS SNKF VFGQNMS ER VLS P P KLNEVS S DANRENA 
AAESGSES S SQEATPE KES LAESAAAYTKATARKCLLEKVE VIT 
GEEAESNVLQMQCKLFVFDKTSQSWVERGRGLLRLNDMASTDDG 
TLQSRLSDAGPRGSLR\LILNTKLWAQMQIDKASEK\SIRITAM 
DNEDQG VK VFL r S AS S KDTGQ VYAALHHR I LALRS RVE QEQEAK 

MPAPEPGAAPSNEEDDSDDDDVLAPSGATAAGAGDEGDGQTTGS 
T 


6069 


583


27 


PTRPGQAGSS SAMAAQRLGKRVLSKLQSPSRARGPGGS PGGLQK 
RHAR VT VKYDRRELQRRLDVEKW I DGRLE EL YRGM EADMPD E IN 
IDELLELESEEERSRKIQGLLKSCGKPVEDFIQELLAKLQGLHR 
Q\PGLRQPSPSP\DGQPSAPFQGPGARTASPLTLLALFPGPPER 
RPALLCVLSCI 


6070 


478 


858 


I RVTVDGE FLHY I FP LQFLDS PE W / R FTE THRGRHF \ Q VTLTAE " 

TDCRYVSWRRKKLYLLFAQHRYISRLFSVLIGSDIADKLYALND 

RVYIGKRYHYDIRLPNFYQMSTPEIRRSPLTQHFQNSRRYW 


6071 


2 


1654 


HEARTKGNMALARP \ VRLFSLVTRLLLAPRRGLTVRSPDE PLP V '" 
VRIPVALQRQLEQRQSRRRNLPRPVLVRPGPLLVSARRPELNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFSIERAQQEAPAVRKLS 
SKGSFADLGAWKPRVLHALQE\AAPEWQ\PTTVQSSTIPSLLR 
GRHVVCAAETGS G KTLS YLL PLLQRLLG\HP SLDS LP I P APRGL 
VLVPSRELAQQVRAVAQPLGRSLGLLVRDLEGGHGMRRIRLQLS 
RQPSADVLVATPGALWKALKSRLISLEQLSFLVLDEADTLLDES 
FLELVDYILE KSH I AEGPADLEDP FNPKAQLVLVGATFPEGVGQ 
LLNKVAS PDAVTT I TSSKLHCI MPHVKQTFLRLKGADKVAELVH 
I LKHRDRAERTGP S GTVLVFCNSS S TVNWLG Y I LDDHK I QHLRL 
QGQMP ALMRVG I FQS FQ KS S RD I LLCTD IAS RGLD STG VEL WN 
YDFPPTLQDYI HRAGRVGR VG S E VPGTV I S FVTH P WDVS L VQK I 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSKPFPLVSSSRWLVKRGELTAYVEDT 
VLFS RRTS KQQ VYF FLFND VL 1 1 TKKKS EES YNVND YS LRDQLL 
VESCDNEELNSS PG KNS S TML YS RQS S ASHL FTLTVLSNHANEK 
VEMLLGAETQSERARWITALGHSSGKPPADRTSLTQVEIVRSFT 
AKQPDELSLQVADWLI\YQRVSDGWYEGER\LRIX3ERGWFPME 
CAKE I TCQAT I DKNVERMGRLLGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/SILVHCAVGVSRSATLVI1AYI1MLYHHLT 
L VEA I KKVKDHRG I I PNRGFLRQLLALDRRLRQGLEA 


6074 


168 


1110 


P GARCMATELQC PD S M PCHNQQVNS ASTPS PEQL RPGDL I LDHA " 
GGNRAS RAKVI LLTG YAHS SLPAELDS GACGG S S LNS EGNS G S G 
D S S S YDAPAGNS FLEDCELSRQ I GAQLKLLPMNDQ I RELQT 1 1 R 
D KTAS RGDFMFS ADRLI RLWE EGLNQL P YKECMVTTPTG YK YE 
GVKFEKGNCGVSIMRSGEAMEQGLRDCCRSIRIGKILIQSDEET 
QRAKVYYAKFPPDIYRRKVLLMYPILQTG\NTVIEAVKVIjIEHG 
VQPSVI ILLSLFSTPHGAKSI IQEFPEITILTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


PPTCQPQEVEHH\YGYVPIU3NKTLPSRCHQCVIVSSSSHLLGT 
KLGPE I E RAE CT I RMNDAP TTG YS ADVGNKTTYR WAHS S VFRV 
LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFITEKRVFSSWAQLYGITFSHPSWT 


6076


1721 


107 


HPSPTEAPRVQHLTMDCTWRILFLVAAATGTHAQVQLVQSGAEV " 
KKPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED 
GETIYAQKFQGRVTMTEDTSTDTAYMELSSLRSEDTAVYYCATD 
HGDYAFD I WGQGTMVTVSSAPTKAPDVF P I ISGCRHPKDNSPW 
LACLITGYHPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQ WRQGE YKCWQHTASKSKKE I FRWPES P KAQASSV 
PTAQ PQAEG S LAKATTAPATTRNTGRGG E E KKKE KE KE EQEERE 
T KTPECP SHTQ P LGVYL LTPAVQDLWLRD KAT FTC FWG S DLKD 
AHLTWEVAGKVPTGGVEEGLLERHSNGSQSQHSRLTLPRSLWNA 
GTS VTCTLNHP S LP PQRLMALRE PAAQAP VKLSLNLLAS S D PP E 
A\ AS WLL CE VSG FS P PN I LLMWLEDHGE VNTSGFAPAR PLP KP \ 
RSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
EVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPIFWIGLISSVCCVFAQTDENRCLKANAKSCGECIQ 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GSKDIKKNKNVTNRSKGTAEKLKPEDITQIQPQQLVLRLRSGEP 
QTFTLKFKRAEDYPIDLYYLM\DLSYSMKDDLENVKSLGTDLMN 
EMRR I TSDFR I G FGS FVE KTVMP YI STTPAKLRNPCTS EQNCTS 
PFS YKNVLSLTNKGEVFNELVGKQR ISGNLDS PEGGFDAI MQVA 
VCGS L I G WRNVTRLL VFS TDAG FH FAGDG KLGG I VLPNDGQCHL 
ENNM YTMSH YYD YP S I AHL V0_ KLS ENN I QT I FAVTE E FQPVYKE 
LKNLI P KS AVGTLS ANS SNVI QL 1 1 DAYNSLS SE VI LENGKLSE 
uvi j.o iyo i \v-iu.Nvjvj>iufi wi>iiJoj\is.Uolv ±o±\jUei Vijrji/li^l IcjviK. 
CPKKDSDSFKIRPLGFTEEVEVILQYICECECQSEGIPESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDEVNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNEIYSGKFCECDNFNCDRS 
NGL I CGGNGVC KCRVCE CNPNYTGS ACD CS LDTSTCEASNGQ I C 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYSVNGNNEVMVHWENPECPTGPDI IP IVAGWAG 
I VLIGLALLL I WKLLMI IHDRRE FAKFEKEKMNAKWDTGENP I Y 
KS AVTTWNP KYEGK 


6078


±4 £. b 


i 1 Q(1 


a 1 ilu V WciiLE Ji.DL I CP I CCSLr DDPRVLPCSHNFCKKCLEGILE 
GS VRNSLWRP VPFKCPTCRKKTFS YWEL I PLQVNYSLKG I VE KY 
NK I KI S PKMP VCKGH \ LGQPLNI F \ CL \ TDMQ LDh/CGZ C\ ATR 
GEHTKHVFCS I EDAYAQERDAFE SLFQS FETWRRGDALSRLDTL 
ETSKRKSLQLLTKDSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQAYDPE INKLNTI LQEQRMAFNIAEAFKDVSE P I VFLQQM 
QEFREKIKYIKETPLPPSNLPASPLMKNFDTSQWEDIKLVDVDK 
LSLPQDTGTFISKIPWSFYKLFLLILLLGLVIVFGPTMFLEWSL 
FDDLATWKGCLSNFSS YLTKTADFI EQSVFYWEQVTDG FF r FNE 
RFKNFTLWLNNVAEFVCKYKLL 


6079 


1586 


141 


ATARDLGCARR I DRWMESTPS RGLNRVHLQCRNLQE FLGGLSP 
GVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 

NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGSPSAAVSQDLAQLIiSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLW Y FMLQ YLQTAQ SRGMDLVE I LS F L FQLS FS TLGKD 
YSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIALFSE 
MLYPFP\NMW\ARVTR\ES VQQAIASGITAQQI IHFLRTRAHP 
VMLKQTPVLPPTITDQIR.LWELERDRLRFTEGVLYNQFLSQVDF 
ELL\LAHAPKLGVLVFE /NTPAKRLMVVTPAGHSDVKRFWKRQK 
HSS 


6080 


X 


i i on 

j. ray 


TLRQQ CLDSG VL FKDPE FPAC PSALG YKDLGPGS PQTQG 1 1 WKR 
PTELCPSPQFIVGGATRTDICQGGLGDCWLLAAIASLTLNEELL 
YR WPRDQDFQENYAG IFHFQPLCPPSP \ FWQ YGE WVEWI DDR 
LPTKNGOLLFLHSEQGNEFWSALLEKAYAKLNGCYEALAGGSTV 
EGFEDFTGGI SEF YDLKKP PANL YQ I IRKALCAGSLLGCS IDVY 
SAAEAEAITSQKLVKSHAYSVTGVEEVNFQGHPEKLIRLRNPWG 

Q FSRLE I CNLSPDS LS S E E VH KWNLVLFNGHWTRGS TAGGCQNY 
PGSS 


6081 


3 


865 


EMLPLLLPLPLLWA/GALAQDARFRLEMPESVTVQEGLCIFVHC 
S VF YLE YGWKDS TPAYGHW FREG VSVDQETPVATNNSTQKVQ KE 
TQGRFHLLGDPSRNNCSLSIRDARRRDNGSYFFWVARGRTKFSY 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTI I PRPQDHGTNLI CQVTFP 

Vj/\Lj V i. l CiK.1 lyiio VonlMbul V £■£< V V V Li/Vv V VAViVlljJLil-jdjL.ljl 

I LS FHKKKAVRAVE VE ENVYAVMG 


6082 


283 


1288 


EARSPGPTQTRTAPGLAAPGLAQPAALRLLLSR P PSAAMDGDGD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\LDE 
LPEPLLA/ LRVLAALPRHE \LVQACR \LVCLRWKELVDGAPLWL 
LKCQQEGLVPEGGVEEERDHWQQFYFLSKRRRNLLRNPCGEEDL 
EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQVIDLQAEGYWEELLDTTQPAIWKDWYSGRSDAGCLYELTV 
KLLSEHENVLAEFSSGQVAVPQDSDGGGWMEISHTFTDYGPGVR 
FVRFEHGGQDSVYWKGWFGARVTNSSVWVEP 


6083 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR 
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 


6084 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 


6085 


2 


1456 


SGPRSFQGNRAVGRISLGGKRNPEVTLLPGVSSERVRRWRRARV 
GVARVKPGNPWKPSPATQVPR/VPAQVYLPGRGPPLREGEELVM 
DEEA Y VL YHRAQTGAP CL S FD I VRDHLGDNRTEL PLTL YLCAGT 
QAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPQLELAMVPHYGGINRVRVS WLGEE PVAGVWS EKGQVEVFALR 
RLLQVVEEPQALAAFLRDEQAQMKPI FSFAGHMGEGFALDWS PR 
VTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 
S P TENTVFAS CS ADAS I R I WDI RAAPS KACMLTTATAHDGDVNV 
ISWSRREPFLLSGGDDGALKIWDLRQFKSGSPVATFKQHVAPVT 
S VE WHPQDSGVFAAS GADHQ I TQWDLG / IVERDPEAGDVEADPG 

LADL PQQLLFVHQGE TELKELHWHPQCPGLLVSTALSGFT I FRT 
ISV 


6086 


2419 


1357 


GAATQHGGAMNLL P CNPHGNGL LYAG FNQDHGCFACGMENGFR V 
YNTDPLKEKEKQEFLEGGVGHVEMLFRCNYLALVGGGKKPKYPP 
NKVMIWDDLKKKTVIEIEFSTEVKAVKLRR\DKIVVVLDSMIKV 
FTFTHNP \HQLHVFE \TCYNPKGLCVLCPNSNNSLIiAFPGTHTG 
HVQLVDLAS TE KP P VDI PAHEG VLSC I ALNLQGTR I ATAS EKGT 
L I R I FDTSS GHL I QELRRGS QAANI Y C I NFNQDAS L I CVSSDHG 
TVHIFAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 
CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL 


6087 


476 


1877 


QNS QRTGL P I T I FS RS FPLLTG S DLCENMP CTCTWRNWRQW I RP 
LVAV I YL VS I WAVPLCVWE LQKLE VG I HTKAWF I AG I FLLLT I 
PISLWVILQHLVHYTQPELQKPIIRILWMVP1YSLDSWIALKYP 
GI AI YVDTCRECYEAYVI YNFMGFLTNYLTNRYPNLVL I LEAKD 
QQKHFPPLCCCPPWAMGEVLLFRCKLGVLQYTVVRPFTTIVALI 
CELLGIYDEGNFSFSNAWTYLVI INNMSQLFAMYCLLLFYKVLK 
EELSP IQP VG KFLCVKL WFVS FWQA W I ALL VKVG VI S E KHTW 
EWQTVEAVATGLQDFI I CIEMFLAAIA\HHYTFS YKPYVQEAEE 
GS C FDSFLAMWD VSD I RDDIS EQ VRHVGRT VRGHPR KKL FPEDQ 
DQNEHTS LLSS SS QDAI S IAS S M P PS PMGHYQG FGHT VTPQTTP 
TTAKISDEILSDTIGEKKEPSDKSVDS 


6088 


1684 


689 


GASGLVRLLQQGHRCLLAPVAPKLVP PVRGVKKGFRAAFRFQKE 
LE RQRLLRC P P PPVRRS E KPNWD YHAE IQAFGHRLQENFS LDLL 
KTAF VNS CY I KS EEAKRQQLG I EKEAVLLNLKSNQELS EQGTS F 
SQTCLTQFLEDEYPDMPTEGIKNLVDFLTGEEWCHVARNLAVE 
QLTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
TQMTGKELFEMWKIINPMGLLVEELKKRNVSAPESRLTRQSG\A 
PTALPLYFVGLYCDKKLIAEGPGETVLVAEEEAARVALRKLYGF 
TENRRPWNYSKPKETLRAEKS ITAS 


6089


3 


3054 


TRU3IPGSTISSRPRLCALAAEGHFLGHSWTGSRAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQSLVKHSSGIKGSLPLQK 
LHLVSRS I YHSHHPTLKLQRPQLRTSFQQFS SLTNL PLRKLKFS 
P I KYGYQ PRRNFWPARLATRLL KLR YLI LGS AVGGG YTAKKTFD 
QWKDMIPDLSEYKWIVPDIWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VS DKEK I DQLQ EE LLHTQLKYQR I LERLE KENKELRKLVLQKDD 
KG I P FI ESLRKSLIDMYS EVLDVLS D YDAS YNTQDHLPR WWG 
DQSAGKTS VLEMIAQARI FPRGSGEMMTRS P VKVTLSEGPHHVA 
LFKDSSREFDIiTKEEDLAALRHEIELRMRKNVKEGCTVSPETIS 
LNVKGPGLQRMVLVDLPGVINTVTSGMAPDTKETIFS I SKAYMQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRIQQIIEGKLFPMKALGYFAWTGKGNSSESIEAI 
REYEEEFFQNSKLLKTSMLKAHQVTTRNLSLAVSDCFWKMVRES 
VEQQADSFKATRFNLETEWKNNYPRLRELDRNELFEKAKNEILD 
E VI SLSQVTP KHWEE I LQQSLWERVSTHVI ENI YLPAAQTMNSG 
TFNTTVDIKLKQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 
EHDD I FDKLKEAVKEKS I KRHKWNDFAEDSLRVIQHNALEDRS I 
SDKQQWDAAIYFMEEALQARLKDTENAIENMVGPD\WKKRWLYW 
KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVE VDPSL I KDTWHQVYRRHFLKTALNHCNLCRRGFYYYQRH 
FVD S E LECND WL FWR I QRMLA ITANTLRQQLTNTEVRRLE KNV 
KEVLEDFAEDGEKKIKLLTGKRVQIiAEDLKKVRE I QEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS/AS PPLATQTWPLQHCKI PELPVQAS IL 
FELQLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHLID 
FNLIiMVTT I VLGRRF IG S I VKE ASQRGKVSL FRS I LLFLTR FT V 
LTATGWS LCRSL IHLFRT YS FLNLL/ FPLLS VWDVHS VPAAELR 
P \ RKTS LFNHMAS MGPRB AVSGLAKS RD YLLTLR \ RRGS S TQD S 
CMARTPCP/ PHACCLSPSL I RS EVE FLKMDFNWRMKE VLVS SML 
SAYYVAFVPVWFVKNTHYYDKRWSCELFLLVSISTSVILMQHLL 
PASYCDLLHKAAAHLGCWQKVDPALCSNVLQHPWTEECWWPQGV 
LVKHSKNVYKAVGHYNVAIPSDVSHFRFHFFFSKPLRILNILLL 
LEGAVI VYQL YS LMS SE KWHQT I SLAL I L FSN Y YAFFKLLRDRL 
VLGKAYSYSASPQRDLDHRFS 


6091 


3279 


412 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG' 

WQP PT YHSGRAFS AR YPRPS RRG YSSHHGP S WRKKYS LVNRP PG 

PSDPPADHAVRPLHGARGGQPPVPQQHVLERQVQLSQGQNWIK 

VKPPSKSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 

LQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 

PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 

LLGDRRVDAGHTDQP VPSGS VGG PARPASGPRQAREAS LWTCR 

TNKFRKNNYKWVAAS SKSPRVARRALS PRVAAENVCKASAGMAN 

KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKWKASSPSASSSS 

SFRWQSEAGSKDHASQLSPVLSRSPSGD\RPAVGHSGLKPLSGE 

TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 

RRQALRGKS S PVLKKT PNKGLVQVTTHRL CRLP PS RAHLPTKEA 

SSLHAVRTAPTSKVI KTRYR I VKKTPAS PLSAPPFPLSLPSWRA 

RRLSLSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYRCIGGVLY 

KVSANKLSKTSGQPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 

RS LAI IRQARQRREKRKE YCMYYNRFGRCNRGERCPYIHDPEKV 

AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCS YFLKG I CSNSN 

CPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 

AC P RGAQ CQLLHRTQKRHS RRAATS P APGPS DATARS R VS AS HG 

PRKPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 

SSSSSSSSPPASLDHEAPSliQEAAIiAAACSNRLCKLPSFISLQS 

S PSPGAQPRVRAPRAPLTKDSGKPLHIKPRL 


6092 


143 


3190 


AKAP PTGES S E P EAKVLHTKRLYRAWEAVHRLDIilLCNKTAYQ 
EVFKPENISLRNKLRELCVKZjMFLHPVDYGRKAEELLWRKVYYE 
VIQLIKTNKKHIHSRSTLECAYRTHLVAG1GFYQHLLLYIQSHY 
QLELQCC IDWTHVTDPL IGCKKP VSASGKEMDWAQMACHRCLVY 
LGDLSRYQNELAGVDTELLAERFYYQAIiSVAPQIGMPFNQLGTIi 
AGSKYYNVEAMYC YLRCI QSEVS FEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEIi 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
LIFQMVI ICLMCVHSLERAGSKQYSAAIAFTLALFSHLVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDIiSEGFESDSSHD 

EEEGTRS PTLEPPRGRS EAPDS LNG PLGPS EAS I ASNLQAMSTQ 
MFQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDLIIVCAQSSQSLWNRLSVLLNLIiPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLSTLEESWRICCIRSFGHFIARLQGSILQFNPEVGIF 
VSIAQSEQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 

VIIPRTVIDGIiDLLKKEHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDPSG 
MVTIITGLPLDNPSLLSGPMQAALQAAAHASVDIKNVLDFYKQW 
KEIG 


6093 


76 


1002 


ACGRPJ^ILALRVART/SRWGAL\RGAVWAPGTRPSKRRACWALL 
PPVPCCLGCLAERWRLRPAALGLRLPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTI PNMLSMTRIGLAP 
Xrr .rt vr . T T PPD PWT ZXT ,f5VPA.T JVCtiTDLIiDGFT ARNWANORSALGS 
ALDPLADKI L IS I LYVSLTYADLI PVPLTYM I I SRDVMLI AAVF 
YVR YRTLPTPRTLAKYFNPCYATARLKPTFI S KVNTAVQL ILVA 
ASLAAPVFNYADS I YLQILWCFTAFTTAASAYS YYHYGRKTVQV 
IKD 


6094 


23 


1010 


PFLRCLRGDQKAKMSERKVLNKYYPPDFDPSKIPKLKLPKDRQY 
WRLMAP FNMRCKTCGE Y I YKGKKFNARKETVQNE VYLGLP I FR 
FYIKCTRCLAE I TFKTDPENTDYTMEHGATRNFQAEKLLEEEEK 
RVQKEREDEELNNPMKVLENRTKDSKLEMEVLENLQEIjKDIiNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPI^FQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 
VKVHTVPKPG KG ADLS KP PCRKAKE I RKER KRLKLMQQNP AGE L 
EG FQAQGH P PS LF P P KAKSNQPKS LEDL I FE S LPENAS HKLE VR 
WRSSPPSSQFKATLLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLHEKTSQLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPIEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKKQQKDPS 
EEAAVLQYASLVGQKCSERMLLFRN 


6096 


2277 


575 


QRVRAALLSSAMEDSEALGFEHMGLDPRLLQAVTDLGWSRPTLI ' '" 

QEKAI PLALEGKDLLARARTGSGKTAAYAI PMLQLLLHRKATGP 

WEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVANVSAA 

EDSVSQRAVLMEKPDWVGTPSRILSHLQQDSLKLRDSLELLW 

DEADLL FS FG FEE EL KS LL CHLPR I YQAFLMS AT FNED VQALKE 

LILHNPVTLKLQESQLPGPDQLQQFQWCETEEDKFLLLYALLK 

LSLIRGKSLLFVNTLERSYRLRLFLEQFSIPTCVLNGELPLRSR 

CHIISQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 

EAG VARGI D FHHVS AVLNFDLPPTP EAYIHRAGRTARANN PG I V 

LTFVLPTEQFHLGKIEBLLSGENRGPILLPYQFRMEEIEGFRYR 

CRDAMRSVTKQAIREARLKEIKEELLHSEKLKTYFEDNPR\DLQ 

LLRHDLPLHPAWKPHLGHVPDYLVPPAIiRGLVRPHKK\GRSCL 

PLVGRPREQSPRTHCAASSTKERNSDPQPSPPEWGPLWS 


6097 


1673 


192 


APGTMS GGKKKS S FQI TS VTTD YEG PGS PGASD PPTPQPPTGPP 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVDVYERDLEPHSFGGLLEGIRG7ASGGAGGRSLDSRL 
ELAS LGLGAP TP P SGLSQGP TS WLR P P PTS PGPQARS FTGGLGQ 
LWPSKAKAEKPPLSASSPQQRPPEPETGESAGTSRAATPLPSL 
RVEAEAGGSGARTPPLSRRKAVDMRLRMELGAPEEMGQVPPLDS 
RPSSPALYFTODASLVHKSPDPFGAVAAQKFSLAHSMLAISGHL 
DSDDDSGSGSLVGIDNKIEQAMDLVKSHLMFAVREEVEVLKEQI 
RELAERNAALEQENGLLRALA\SPEQLGSAGPPRGVPR\LGPPA 
PNGP FVLS LPS LT I VPLGLPGLASAAWP PLPMPAL I VPVFPGVG 
VQALSNGPWSPGPLPHLLIIPSLDGGGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ 


6098 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D 
E LTKEKDQI EAALSRMPS PGGR I TLQTRLNQEAFGRS FGKD 


6100 


2 


713 


F VE VS G YRS RADPE PRGR DTMT YA YLFKY 1 1 IGDTGVGKSCLLL 
Q FTDKRFQP VHDLT I GVE FGARM VN I DGKQ I KLQI WDTAG QES F 
RS I TRS Y YRGAAGALLVYDITRRETFNHLTS WLEDARQHS SSNM 
VIMLIGNKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAFINTAKEIYRKIQQGLFDVHNEANGIKIGPQQSISTSVGP 
SASQRNSRDIGSNSGCC 


6101 


1 


1399 


FRGRAWPLREVSHWLGCRRVCSWSASWGW»PALS7\RLSPLLAFR 
GKMVFPLSCAVQQYAWGKMGSNSE VARLLAS SDPLAQ I AEDKP Y 
AELWMGTHPRGDAKILDNRISQKTLSQWIAENQDSLGSKVKDTF 
NGNLPFLFKVLSVETPLSIQAHPNKELAEKLHLQAPQHYPDANH 
KPEMAI ALT P FQGLCG FRPVEE I VTFLKKVP EFQFL IGDE AATH 
LXQTMSHDS QAVASSLQSCFSHLMKS EKKVWEQLNLLVKR ISQ 
QAAAGNNMEDIPGELLLQLHQQYPGDIGCFAIYFLNLLTLKPGE 
AMFLEANVPHAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP 
G\ S VTE YKDLALDSAS I LLMVQGTVIAS T PTTQTP I PLQRGGVL 
FIGANESVSLKLTEPKDLLIFRACCLL 


6102 


70 


2415 


QT P QATLAANGAEDS RGG EML PAGE IGAS P AAPCC S ES GDERKN 
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 
SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNAEES 
KQFLNQFLEQETHLFSAINSHLLTAQPWMDDLGTMISQIEEIER 
H IiAYLKW ISQIEELSDNI QQ YLMTNNVP EAASTLVS MAE LD I KL 
QESSCTHLLGFMRATVKFWHKILKDKLTSDFEEILAQLHWPFIA 
PPQSQTVGLSRPASAPEIYSYLETLFCQLLKLQTSHELLTEPK\ 
HSQKNTLFLPPLLSS/WPIQVMLTPLQKRFRYHFRGNRQTNVLS 
KPEWYIiAQVLMWIGNHTEFLDBKIQPILDKVGSLVNARLEFSRG 
LMMLVLEKLATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGYP 
GTFAS CMH I LS EETCFQRW LT VER KFALQ KMDS MLS S EAAWVS Q 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LQKDLVDDFR IRLTQVMKEETRASLGFR YCAILNAVNYI S TVLA 
DWADNVFFLQLQQAALEVFAENNTLSKLQLGQLASKESSVFDDM 
I NLLE RLKHDMLTRQ VDHVFRE VKD AAKL YKKERWL S L PS QSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLEKIFWQMLVEKLD 
VYIYQEI ILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KH I KEACI VLNLNVG S ALTAGKDVLP VQLQGS F PAT 


6103 


207 


2523 


ESNSTMTT YLEFI QQNEERDGVR FS WNVW PS S RLEATRM WPVA 
ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQFPPSYAGISELNQPAELLPQFSSIEYWLRGPQM 
PLIFLYWDTCMEDEDLQALKESMQMSLSLLPPTALVGLITFGR 
MVQVHELGCEGISKSYVFRGTKDLSAKQLQEMLGLSKVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELQRDPWPVPQGKR 
PLRSSGVALS IAVGLLE CTF PNTGAR I MMF IGGPATQG PGM WG 
DELKTP IRSWHD I DKDNAKYVKKGTKHFEALANRAATTGHVI DI 
YACALDQTGLLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTLEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENE I GTGGTCQ WKI CG LS PTTTLAI YFEWNQHNAP I PQGG \ RG 
A\ I Q FVTQ Y \ QHS SGQRR I R VTT I ARN \ WADAQTQ I QN I AAS FD 
QEAAAILMARLAIYRAETEEGPDVU2WLDRQLIRLCQKFGEYHK 
DDP SSFRFSETFS LYPQFMFHLRRS S FLQVFNNS PDESS Y YRHH 
FMRQDLTQSUMI QPI LYAYSFSGPPE PVLLDS SS ILADR ILLM 
DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHLLQAPVDDAQE 
ILHSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAPILTDDVSLQVFMDHLKKLAVSSAA


6104


124 


732 


KVSEYIILSKDKILFHALAMLVLWSPWSAARGVLRNYWERtiLk ' 
FCLPQSRPGFPS PPWGPALAVQ\AQPCLQSQQMI PVEVKRI /RSL 
LDS I F WMAAPKNRRTI EVNRCRRRNPQKLI KVKNNI DVCPECGH 
L KQKHVLCAYC YE KVCKETAE I RRQ IGKQ EGG P FKAPTIET WL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


989 


P LHGACTS LVLQRFCHRR PRP CAPARPE DMRR P AAVPLLLLLC F 
GSQRAKAATACGRPRMLNRMVGGQDTQEGEWPWQVS I QRNGSHF 
CGGSL I AEQWVLTAAHCFRNTS ETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYI LPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPI IDT\ PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRI IPK 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRP PTAPHTGR P PTANRGDPRLDLKRGCARLLTS IESRGRPAAS 
AGTjRRDR CAIiH P WPT.RRAPT iAR IiTP P P T^Cl Q DP P PZV DP DO n nonr* 

WSRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAKNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
I PCKETCENVDCGPG KKCRMNKKNKP RCVCAPD CSNI TWKG PVC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
STCV\ VDQTNNAYCVTCNRI CPEPAS SEQYLCGNDGVTYS \SAC 
HLRKATCLLGRS IGLAYEGKCI KAKSCEDIQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


6107 


623 


168 


SRCSS PRPEPGRGRGK/ LS PSEHRKWVEVFKACDEDHKGYLSRE 
DFKTAWMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRK 
KKbAQRYKNlsVRrllr lAr DL X XKGFbTLEDFKKAFRQVAPKLPE 
RTVLEVFREV\DRDS\DGHVSF 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFVVMCCSMLVLL 
Y YF YDLL VYW I G I FCLAS ATGL YS CLAP C VFJRLP \ SAS AGES A 
LI^TIPNNSLPYFHKRPQARMLLLALFCVAVSVVWGVFPJJEDQ 
WAWVLQDALXSIAFCLYMLKTIRLPTFKACTIiLLLVLFLYDIFFV 
FITPFLTKSGSSIMVEVATGPSDSATREKLPMVLKVPRLNSSPL 
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
A YG VGLLVT FVALALMQRGQ PALLYLVPCTLVTS CAVALWRRE L 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATS P WPAEQS PKSRTS EEMGAGAPMREPGS PAES EGRDQAQPS 
PVTQPGASA 


6109 


1 


1381 


CRSRAGAASGGAILEGTKLRRQRVDTNKPLDPLVPSALRAAMLY 
LEDYLEMIEQLPMDLRDRFTEMREMDLQVQNAMDQLEQRVSEFF 
MNAKKNKPEWREEQMAS I KKDYY KALE DADE KVQLANQ I YDLVD 
RHLRKLDQELAKFKMELEADNAGITEILERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTTDHIPEKKFKSEALLSTLTSDA 
S KENTLG CRNNNS TAS SNNAYNVNS S QPLGS YN I GS LSS GTGAG 
GI \TMAAAQAVQATAQMKEGRRTSSLKAS YEAFKNNDFQLGKEF 
SMARETVGYSSSSALMTTLTQNASSSAADSRSGRKSKNNNKSSS 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNE PRYCICNQVS YGEM VGCDTQDCP IEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464 


ACPSAATMS DQDH SMDEMTAWKI EKGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPS PLALLAATCSRI ES PNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQIISSSSGATPTSKEQSG 
S STNGSNGS ESS KNRTVSGGQ YVVAAAPNLQNQQ VLTGLPGVM P 
N IQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQ IQ 1 1 PGANQQ 
I ITNRGSGGNI IAAMPNLLQQAVPLQGLANNVLSGQTQYVTNVP 
VALNGN I TLLP VNS VS AATLTP S S QAVT I S SSGS QE SGSQ P VTS 
GTTISSASLVSSQASSSSFFTNANSYSTTTTTSNMGIMNFTTSG 
SSGTNSQGQTPQRVSGLOGSDALNIQQNQTSGGSLQAGOQKEGE 
Q\NQQTQAAPKSLSRPQLVQGG\QALQ\AFQAAPLSGQTFTTQA 
ISQETLQNLQLQAVPNSGPI I IRTPTVGPNGQVSWQTLQLQNLQ 
VQNPQAQTITLAPMQGVSLGQTSSSNTTLTPI ASAAS I PAGTVT 
VNAAQLS S M PGLQTI NLSALGTSG I Q VHP I QGLP LAI ANAPGDH 
GAQLGLHGAGGDGIHDDTAGGEEGENSPDAQPQAGRRTRREACT 
CPYCKDSEGRGSGDPGKKKQHICHIQGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHLS KH I KTHQNKKGGPGVALS VGTLPLDSGAGSEGSGT 
AT PSAL I TTNMVAMEAI C PEG I ARLANS G INVKEGGQFCS P INT 
SANGF 
Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
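
The legend above can be applied mechanically to any amino acid segment in this table. The following is a minimal sketch only, assuming each segment is available as a plain string; the helper name `summarize_segment` and the returned dictionary layout are hypothetical and are not part of the specification. It counts residues, tallies the stop codon, deletion and insertion markers, and reports any character that the legend does not define.

```python
# One-letter residue codes exactly as given in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols defined by the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Hypothetical helper: tally residues and legend markers in one segment."""
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue  # whitespace carries no meaning in a segment
        if ch in AMINO_ACIDS:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)  # symbol not defined by the legend
    return {"residues": residues, "markers": markers, "unrecognized": unrecognized}


if __name__ == "__main__":
    # Final line of the segment listed for SEQ ID NO: 6120.
    print(summarize_segment(r"G\VLPNIQAVLLPKKTESQKDEGANDP"))
```

A non-empty `unrecognized` list indicates characters outside the legend, which in a transcribed listing would typically point to a transcription error rather than to a residue.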


6111 


1637 


797 


RVDPRVRGAMAP WGKRLAG VRG VLLDI SGVLYDS GAGGGTAI AG 
SVEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDISEQE 
VTAPAPAACQILKERGLRPYLLIHDGV\ASEFDQIDTS/STPNC 
WIADAGESFSYQNMNNAFQVLMELEKPVLISLGKGRYYKETSG 
LMLD VGP YMKALE YACG I KAEVGG KP S PE F FKS ALQA I GVEAHQ 
AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK 


6112 


77 


196 


MSSHKSFKSKRFIiAKKQKPNRPILQWIWLKTGNKIRHNWK 


6113 


1779 


567 


WEGRSWAACGVNLQGAWGERSGVRASEAESPGKRADVSWWSRQL 
E TMVDH LANTE I NS QR I AAVES CFGAS GQPLALPGR VLLGE G VL 
TKECRKKAKPRIFFLFNDILVYGSIVT.NTOJKYR QfJHTTDT.PPOT 
LELLPETLQAKNR WM I KTAKKS F WSAASATERQEWI SH IEECV 
RRQLRATGRPA\ S TEPIAAP W I PDKATD I CMR CTQTRFS ALTRRH 
HCRKCRWVCAECSRQRFLLPRLSPKPVRVCSIiCYRELAAQQRK 
EEAEEQGAGVPRAASHIiARP I CGRPVEMTMTPTRTRRAAGTATG 
PAAWS STPRGWPGLPSTADPR PAEHLS PSQLHC PG PQEGSSRS C 
PGLRDPIPWWQVQRWGVALSGLPVPFCWTLCPYGFTAGNAFPFR 
KPQNTHRSW 


6114 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCS PRCGLAAGS M CSCS PS WRCT P VPACWPS P P P \PAEQ VQC 
GHL P PHADRRALRL P VAAP ARG PG PGHPAGPAG PR PARTP PAS P 
HGPGRPTVPAPPCPLLAATEPTPSRPHQRWTREDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6115 


324 


71 


D VCGR VCAHPHL YTH I HMH I C AHAC \ I HTHAQLC ^ ITASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


TGVMPPGRWHAA/ISSSGPVFEGARA\LQTVKKEEEDESYTPVQ 
AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWW 
LRPDVLSKAQILELLVLEQFLSILPGELRVWVQLHNPESGEE\L 
WPCWRS CRGTLMGHPGGTRAL P \ EPRCALDG YRS \ LRS AQ I WS L 
AS PLRS SSALGDHLE P P YE I EARDFLftGO tiDTPflanMD aT .t?t>» tt 

GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6117 


1433 


222 


VG VPS PAP P CS WE VG PGGG WT PG I LKEG QGGRRTPLLLLATRTR 
GLLSL F P PAAMHPAA FP LP VWAAVLWGAAPTRGLI RATSDHNA 
SMDFADLPALFGATLSQEGLQGFLVEAHPDNACSPIAPPPPAPV 
NGSVF IALLRRFDCNFDLKVLNAQKAG YGAAWHNVNSNELLNM 
VWNSEE IQOQI WI PS VFIGERS SE YLRALFVYEKGARVLLVPDN 
TFPLGYYLI PFTGI VGLLVLAMGAVMIARCIQHRKRLQRNRLTK 
\EQLKQI \ PTHDYQKGDQYDVCAI CLDEYEDGDKLRVLPCAHAY 
HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDE 
GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118 


1044 


247 


STISCRACTSGATPGAQSHRSARGHAAGGKETAALGMERGKVKK 
KEKEKETQKEKIGEKGREEKVKRKEVEQKIKQ3KQEKQERRKGK 
E KEE KRTKQGKETNKE KE Q FKGQEE KGENKDS TLTRT P LE PLEK 
NKQ I LVLGLDGAGKTS VLHSLASNRVQHS VAPTQGFHAVC INTE 
DSQME FLE IGGS KPFRS YWEMYLSN/ ADS LARS FS VGFKQDS QP 
IT W KAKKYLHQL I AANP VL PL WFANKQDLEAAYH ITD IHEALA 
II 


6119 


1217 


462 


DPRFVTENTTKAPAQERTTQPRSSREGTLRSTMEYLSALNPSDL

LRS VSN I S S E FGRR VWTS AP P PQRP FR VCDHKRT I RKGLTAATR 

QELLAKALETLLLNGVLTLVLEEDGTAVDSEDFFQLLEDDTCLM 

VLQSGQSWSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 

DLFGSLNVKATFYGLYSMSCDF<JGL\GPKKVIiRELLRWTSTLLQ 

GLGHMLLGISSTLRHAVEGAEQWQQKGRLHSY 


6120 


785 


179 


LE RAGGG GLS SRAL VGSGACLS LVARANG KGLPRGRKE FVEAVR 
VRYVAFRYRTPRAVCLRLWSCRREVIMSGRGKQGGKVRAKAKSR 
SSRAGLQFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAE 
I LELAGNAARDNKKTR 1 1 PRHLQLAI RNDEE LNKLLGKVT I AQG 
G\VLPNIQAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


FVRAQARGSRQPVRRPLLGAGSRLRCRSCGRMEPLKVEKFATAN 
RGNGLRAVTPLRPGELLFRSDPLArrVCKGSRGWCDRCLLGKE 
KLMRCSQ CR VAKY CS AKCQKKAW P DHKRE C KCLKS CKPR YP PDS 
VRLLGRWFKLMDGAPSESEKLYSFYDLESNINKLTEDKKEGLR 
QLVMTFQHFMREEIQDASQLPPAFDLFEAFAKVICNSFTICNAE 
MQEVGVGLYPSISLLNHSCDPNCSIVFNGPHLLLRAVRDIEVGE 
E LT I CYLDMLMTS E ERR KQLRDQ YC FE CD \ CFRCQTQDKDADML 
TGDEQVWKEVQESLKKIEELKAHWKWEQVLAMCQAIISSNSERL 
PD I N I YQL KVLDCAMDAC I NLGLLE EAL FYGTRTME PYRIFFPG 
SHPVRGVQVMKVGKLQLHQGMFPQAMKNLRLAFDIMRVTHGREH 
S L I E DLI L LL E / AMRRQHQS I LRERSQRE I RR VSLLNALL RSHT 
LCFVS CVNLS YWKFCS VFV 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADS RMNNP S ETSKPSMES GDG 
NTG TQTNGLD FQKQP VP VGGAI S TAQAQAFLGHLHQ VQLAGTSL 
QAAAQSLNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQLMLA 
GGQ I TGLTLTPAQQQLLLQQAQAQAQIiLAAAVQQHSASQQHSAA 
GAT I S AS AAT PMTQ I PLS Q P IQ I AQDLQQLQQLQQQNLNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 
LLQSQPRI\TLTSQPATPTCTIAATPIQTLPQSQSTPKRIDTPS 
LEEP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGLAMVKLYGND 
FSPTTIFRFEALNLSFKNMCKLKPLLEKWLNDAENLSSDSSLSS 
PSALNSPGIEGLSRRRKKRTSIEA\NIRVALEKSFLEN\QKPTS 
EEITMIADQLNMEKGVIRVWFCNRRQKEKRINPPSSGG\TSSSP 
IKAIFPSPTSIiVATTPSLVTSSAATTLTVSPVLPLTSAAVTNLS 
VTGTSDTTSNNTATVISTAPPASSAVTSPSLSPSPSASASTSEA 
SSAS ETSTTQTTS TPLS S PLGTSQVMVTASGLQTA/AQLLPFKG 
AAQLPANASLAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNS TLATIQALASGGS L P I TS LDATGNLVFANAGGA 
PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 
TSTSAES I QNS L FTVASAS GAAS TTTTAS KAQ 


6123 


3 


2944 


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL
HLQPLEMKRVGWFTPADYGJCVTSLILIRNNLTVIDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQILSIT 
KNFKVENIGPLPITVSSLKINGYNCQGYGFEVLDCHQFSLDPNT 
SRDISIVFTPDFTSSWVIRDLSLVTAADLEFRFTLNVTLPHHLL 
PLCADWPGPS WEES FWRLTVFFVSLS LLGVIL I AFQQAQ YI LM 
E FMKTRQRQNASS S SQQNNGPMDVI S PHS YKSNCKNFLDTYGPS 
DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 
HKTSTAAAS S TS TTTE E KQTS P LGSS L PAAKED I CTDAMRENW I 
S LR YAS G I NVNLQ KNLTLPKN LLNKEENTLKNT I VFSNPSS E CS 
MKEGI QTCMFPKBTD I KTS ENTAE FKERELC PLKTS KKLPENHL 
PRNSPQYHQPDLPE ISRKNNGNNQQVPVKNEVDHCENLKKVDTK 
P S S EKK I HKTS REDM FS EKQD I P FVEQEDP YRKKKLQE KREGNL 
QNLNWSKSRTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSELS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDS VS QNDFPSEAP ISLNLSHNI CNPMTGNSLPQY 
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 
NGVPCVIQESAPVHNSFIDWSATCEGQFSSAYCPLELNDYNAFP 
EENMNYANGFPCPADVQTDFI DHNSQSTWNTP P\NMPAS \ WGNA 
QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 
PTTEHSD/ THMENQA\ WCKE YYPGF \NPFRAYMNLDIWTTT\A~ 
NRNANFPLS RDSS YCGNV 


6124 


1573


236 


S DEALRLAGERGMGRVQLFE I S LSHGRWYSPGE PLAGTVRVRL 
GAPL P FRA I RVTC I GS CG VS NKANDTAWWEEG Y FNS SLS LADK 
GSLPAGEHSFPFQFLLPATAPTSFEGPFGKIVHQVRAAIHTPRF 
SKDHKCSLVFYILSPLNLNSIPDIEQPNVASATKKFSYKLVKTG 
S WLTASTD LRG Y WGQALQ LHADVENQ S G KDTS P WAS LLQ KV 
SYKAKRWIHDVRTIAEVEGAGVKAWRRAQWHEQILVPALPQSAL 
PGCS L I H I D Y YLQ VS L KAPEATVTL P VF I GNI AV / N P CPSE P P A 
RPGAASWGPTPGG\PSAPPQEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSS VPGAPEPCPQDGS PASHPLHPPLC I STCATV 
PYFAEGSGGPVPTTSTLILPPEY3SWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6125 


1 


904 


KTC P KLTCAFTVS VP D S CCR VCRGDGELS WEHSDGD I FRQP ANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
IVQIVINNKHKHGQVCVSNGKTYSHGESWHPNLRAFGIVECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCX3EETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1224 


389 


RLLSEAPCPRS RRRFQMNPEWGQAFVHVAVAGGLCAVAVFTG I F 
DSVSVQVGYEHYAEAPVAGLPAFIiAMPFNSLVNMAYTLLGLSWL 
HRGGAMGLGPRYLKDVFAAMALLYGPVQWLRLWTQWRRAAVLDQ 
WLTLP IFAWP VAWCLYLDRGWRP \ WLFLS LE C VS LAS YGLALLH 
PQG FEVALGAHWPAVGQALRT \HRHYG/ SATPS ATYLALGVLS 
CLGFWLKLCDHQLARWRLFQCLTGHFWSKVCDVLQFHFAFLFL 
THFNTHPR FHP S GG KTR 


6127 


1335 


463 


VLPRRCLVF WNTMDS S REPTLGRLDAAG F WQVWQRFDADEKGY 
IEEKELDAFFLHMLMKLGTDDTVMKANLHKVKQQFMTTQDASKD 
GR I RMKELAGMFLS EDEN FLLL FRRENPLDS S VE FMQ I W RK YDA 
DS SGF I S AAELRNFLRDLFLHHKKAI S EAKLEE YTGTMM K I FDR 
NKDGRLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKI FA 
YYD VS KTG ALEG P \ E VDGF VKDMMEL VQPS I SG VDL DKFRE I LIj 
RHCDVNKDG KI QKS ELALCLGL KINP 


6128 


2511 


843 


TCRMSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLEMSSKD 
SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSSRRAPPSEGTRR 
GGAARPEKTAEEGPPAAPGSLRHSGPLGPHACPTALPEPQVTSA 
MSSQWGIEPLYIKAEPASPDSPKGSSETETEPPVALAPG\PAP 
TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
G YHYG VAS CEACKAFFKRT I QGS I E YS C PASKE CE I TKRRRKAC 
QACRFTKCLR VG ML KEG VRLDR VRGGRQKYKRR PE VD PL P FPG P 
FPAGPLAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 
GHLPAVATLCDLFDREIVVTISWAKS I PGFSSLSLSDQMSVLQS 
VWMEVLVLGVAQRS LTLQDELAFAE YLVLDEEGARPAGLGELG \ 
AALLOLVRRljOAIjRLEREEYVTJ.K'ftT.JiT.aMcinQVUTPnn'D'DT wo 
S CE KLLHEALLE YE AGRAG PGGGAERRRAGRLLLTL PL LRQTAG 
KVLAHFYGVKLEGKVPMHKLFLEMLEAMMD 


6129 


1764 


771 


ARFARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLPICAQCGKTKCMM 
KSSDCVIKHAGVYSTGLAMVGAI CDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQVLEAETFKCVS CNRLGQHS CLRCKACFCDDHTRS KVFKQEKG 
KQ P P CP KCGHETQE TKDLSMS TRS LKFGRQTGGEEGDGAS GYDA 
YWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQEN 


6130 


3 


577 


GRGGTMREYKVWLGSG\GVGKSALTV\QFVTCTFIEKYDPTIE 






WO 01/53312 



PCT/US00/34263 



D F YRKE I E V \DS S PS VAG I S WTQQGTEQF \ASMRDL Y I KKGQGC 
ILVYSLVNQQSFQ\DIKPMRDQIIRVKVSEKVPVI\LVGN\SVD 
LESEREVSSSEGRALAEEWGCPFMETSAXSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGS PRHLP S CS PALLLLVLGGCLG VF 
GVAAGTRRPNWLLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRASILTGKYPHNHHWNNTLEGNCSSKSWQKI 
QE PNTFPAI LRSMCGYQTFF\AGKYLNE YGAPDAGGLEHVPLGW 
S YWYALE KNS KYYNYTLS ING KARKHGENYS VD YLTD VLANVS L 
DFLDYKSNFEPFFMMTATP \APHS PWTAAPQYQKAFQNVFAPRN 
KW FN IHGTNKHWL I RQAKTPMTNS S IQ FL DNAFRKRWQTLLS VD 
DLVE KLVKRLE FTGE LNNT Y I F YTS DNG YHTGQ FS L P I D KRQL Y 
EFDIKVPLLVRGPGIKPNQTSKMLVANIDLGPTILDIAGYDLNK 
TQM DGMS LL P I LRG ASNLTWRS DVLVE YQGEGRNVTD PTC PSLS 
PGVS QCF P D CVOSDAYNNTYACVRTMS ALWNLQ YCE FDDQE VFV 
EVYNLTAD PDQ ITNI AKTIDPE LLG KMN YRLMMLQS CSG PTCRT 
PGVFDPG YRFD PRLM FSNRGS VRTRRFS KHLL 


6132 


96 


1241 


AAGLLPPGLVPEDPRRTRNLLP FG I QGP P FALS R PLF S CVE SGW 
AWEAME PEFL YDLLQL PKGVE P PAEEELS KGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRLAALKLEAEDIALTATSQKHKLTWLEAVNRS \CS WRSGRP 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI 
ESTKSGLKSEKLVEQLTEYSTDKDEPPFCDVFDELFKLAPEKVNA 
VKEAIVNFVNQKLDRLGLSVQNLDTQFADGVILLLLIGQLEGFF 
LHLKEFYLTPNSPAEMLHNVTLALELL/ IGRGPAQLPC /LALK/ 
TIVNKDAKSTLRVLYGLFCKHTQKAHRDRTPHGAPN 


6133 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVS VS QQ PVSAP VP IAAHAS VAGHLS TS TT VS S SGAQNS D S T K 
KTLVTL I ANNNAGNP LVQQGGQ PL I LTQNPAPGLGTMVTQP VLR 
P VQVMQNANHVTSS P VAS Q P I F ITTQGFP VRNVRPVQNAMNQVG 
I VLNVQQGQTVRP I TLVPAPGTQFVKPTVGVPQVFS QMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNNIRFMNHMKRHVE 
LDQQNG E VDGHT I CQHCYRQ FS T P FQLQ CHLENVHS P YES TTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSS I L PRGLTW I AHSRHGQTRDR VHDR 
NVKNMY P P P S FPTNKAATVKS AGATPAE PEELLTPLAPALPS PA 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPQ
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL
DTEVLSSDDRKENALQTVGTGEPWCDWLAILADGTVLPTLVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RS KGM L VMDCHRTHLS E E VLAMLS AS STLPAWPAGCS S K I Q PL 
DVCI KRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES
FYGFEEADLDLMEI 


6134 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTL I ANNNAGN P L VQQGGQP L I LTQNPAPGLGTMVTQ P VLR 
P VQVMQNANHVTS S PVASQP I F I TTQGFPVRNVRP VQNAMNQVG 
I VLNVQQGQTVRP I TL VPAPGTQ FVKP TVG VP Q VFSQMTP VRPG 
S TMP VR PTTNT FTTV I PATLT IRS TVP QS QS Q Q TKS TP STS TTP 
TATQ PTSLGQLAVQS PG QS NQTTN P KLAPS FP S PPAVS IAS FVT 
VKR PG VTGENS NE VAKL VNTLNT IPS LGQS PG P WVSNNS S AH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRKICPRCNAQFRVTEAL 
RGKMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/ TH P S S TP I PALS PP Y / TKVPEPNE NVGDAVQT KL I MLVDDF Y Y 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKP KQLEGLKPGTKVT I RA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VMGRQTCLECS FE I PDFPNHFPTYVHCSLCRYST 
CCSRAYANHM INNHVPRKS PKYLALFKNS VSG I KLACTSCTF VT 
S VGD AMAKHLV FNP S HRS SSI LPRGLTW I AHS RHGQTRDR VHDR 
NVKNM YP P PS FP TNKAATVKSAGAT PAE P EELLTP LAP AL PS PA 
STATP PPTPTHPQALALPPLATEGAECLNVDDQDEGS PVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLS SDDRKENALQTVGTGEPWCDWLAI LADGTVLPTLVFY 
RGQMDQPANMPDS ILLEAKESGYSDDE IMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6135 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVS VSQQP VSAPVP I AAHAS VAGHLSTSTTVS SSGAQNSDSTK 
KTLVTL I ANNNAGN PL VQQGGQ PL I LTQNPAPGLGTMVTQP VLR 
P VQVMQNANHVTS S PVASQP I FI TTQG FPVRNVRP VQNAMNQVG 
I VLNVQQGQTVRP I TLVPAPGTQFVKPTVGVPQ VFSQMTP VRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS IAS FVT 
VKRPGVTGENSNEVAKLVNTLNT I PSLGQS PGP VWSNNSSAH\ 
GSQRTS G PES SMKVTS S I PVFD LQDGGRK I C PRCNAQ FR VTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
K I CE WAFE S E P LFLQHMKDTHKPGEMP YVCQVCQ YRS S LYS EVD 
VHFRM IHEDTRHLL CP YC LKVFKNGNAFQQHYMRHQKR \NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKP KQLEGLKPGTKVT IRA 
S RGQPRT V P VS SNDTP PS ALQEAAPLTSSMDP L PVFLYP P VQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT
S VGDAMAKHLVFNPSHRSSS I LPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPQ
RR I RRWLRRFQAS QGENLEG K YLS FEAE EKLAE WVLTQREQQL P 
VNE ETL FQKATK I GRS LEGG FKI S YEWAVRFMLRHHLT PHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLS S DDRKENALQTVGTGE PWCDWLAI LADGTVLPTLVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS 
SLFRGEHHSRGGTX3RLASLFSSLEPQIQPVYVPVPK\ESAXiASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKILDDTEDTWSQRKKIQ 
INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR 
SL I PAEGTLS KKLAAI KRKIHPDQKNINAYWFKEESAATQALK 
RNGAQ I ADGFRIRVDLASETSSRDKRS VFVGNLP YKVE ES AI EK 
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLFENTDSVHLALKLEQN 
SELMGRKLRVMRSVNKEKFKQQNSNPRLKNVSKPKQGLNFTSKT 
AEGHP KSL F I GE KAVLLKTKKKGQ KKS GRP KKQRKQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNML I VAM CLA\ LLGL PG KAQELQGH VS \ 1 1 IiAGEQLGD LAKK 
YLWQG\LFQLYLDEAGRGHSFSFHGAALTAPKQGQELMAKALES 
LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ 
LQHAGLREAGGIFYFSVPPFAYEDIARNINSSCRPGPGAWLRW 
LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQIIi 
PFRDQNRKALDGLWNRHHVERVEI 1 MKE TVDAEGRTS FYE B YGV 

QRGS AWGQ YQ S YSE QVRRELQ KPDS FHSLTPTFAGVLVH I DNL 
RWEGVP FI LMS GKALDER VG YAR I L FKNQACC VQS EKHWAAAQS 
QCLPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG 
LRLFGS PLSDYYAYS PVRERDAHSVLLSHI FHGRKNFFITTENL 
LASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
ANDIEATAVRAVRRFGQFHLALSGGSSPVALFQQLATAHYGFPW 
AHTHLWLVDER C VPLS DPESNFQGLQAHLLQHVR I P YYNIH \ AM 
P VHLQ QRLCAE EDQGAH I YAR E I S ALGANSS FDLVLLGMGADGH 
TASLFPQSPTGLDGEQLWLTTSPSQPHRRMSLSLPLINRAKKV 
AVLVMGRMKREITTLVSRVGHEPKKWPISGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587 


934 


EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLFRFL 
TDrSHLLSAVKGQERFSLYQTRSLIPIELKWKEIHFQRRRTTCAL 
TLEAGEKLLLTTDIiKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQSTVETWDQCEKKIKELKSRLQVLKAQSEDPLPELHEDLHNEK 
ELIKELEQSLASWTQNLICELQTMKADLTRHVLVEDVMVLKEQIE 
HLHRQWEDLCLRVAIRKQEIEDRLNTWWFNEKNKELCAWLVQM 
ENKVLQTADISIEEMIEKLQKDCMEEINLFSENKLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETFAFI 
QQLDKNMSNLRTWLARIESELSKPVVYDVCDDQEIQKRLAEQQD 
LQRD IEQHSAGVESVFNICDVLLHDSDACANETECDS IQQTTRS 
LDRRWRNICAMSMERRMKIEETWRLWQKFLDDYSRFEDWLKSAE 
RTAACPNS S EVLYTSAKEE LKR FEAFQRQ I HERLTQLEL INKQ Y 
RRLARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT
NQREEFEGTRESILWLTEMDLQLTNVEHFSESDADDKMRQLNG 
FQQEITLNTNKIDQLrVFGEQLlQKSEP\LDAVLIEDELEBLHR 
YCQEVFGRVSRFHRRLTSCTPGLEDE KEASENETDMEDPREI QT 
DSWRKRGESEEPSSPQSLCHLVAPGHERSGCETPVSVDS\IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEGPRVLNGNPQQEDGGLAG ITEQQSGAFDRWEM I QAQEL\HNK 
LKIKQNLQQLNSDISAITTWLKXTEAELEMLKMAKPPSDIQEIE 
LRVKRLQE I LKAFDT YKALWS VNVSS KE FLQTES PES TELQSR 
LRQLSLLWEAAQGAVDSWRGGLRQSLMQCQDFHQLSQNLLLWLA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQE I SNS LL I KGHGEDC I EAE E KVHV I \ E KKLKQLRE Q VSQDLM 
ALQGTQNPASPLPSFDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
GEEETESRVPGSTRPQRSFLSRVVRAALPLQLLLLLLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERRPWEASSWKTL/LAGWIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTLS C IRWYRRESMFGFFKGMS FPLAS I AVYNS WFGVFSN 
TQRFIjSQHRCGEPEASPPRTLSDLLLASMVAGWSVGLGGPVDL 
I K I RLQMQTPP VSGRQ PR FEVQGSGS CG \ EP AYQG PVHC I TT I V 
RNEGLAGLYRGASAMLLRDVPGYCLYFIPYVFLSEWITPEACTG 
PS P CAVWLAGGMAGAI S WGTATPMD WKS RLQ ADGVYLNKYKG V 
LDCISQSYQKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


694 


136


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVRAQPLSVTVWAP 
RCQRP/QPPAPEPSSPNAAVPEAIPTPRAAASAALELPLGPAPV 
SVAPQAEAEARSTPGPAGSRLGPETFRQRFRQFRYQDAAGPREA 
FRQLREL/SPRQWLRPDI \RTKEQ\ I VEMLVQEQLLAILPEAAR 
ARRIRRRTDVRITG


6141


2 


984 


AQ VG PRSR P C KMPLKLRGKKKAKS KE TAGLVEGE PTGAGGGS LS 
ASRAPARRLVFHAQLAHGSATGRVEGFSSIQELYAQIAGAFEIS 
PS E I LYCTLNTP KIDMERLLGGQLGLEDFI FAHVKG IE KE VNVY 
KSEDSLGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHIESI 
NGEN I VGWRH YD VAKKL KE L KKEEL F TMKL IE P KKAFE I ELRS K 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLELYMGIRDIDLATTMFEAGKDKVNPDEFAVALDETLGDFAFP 
DEFVFDVWGVIGDAKRRGL 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARIGPGVMESKEERALNN 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FRVRQPILQYRWDIMHRLGEPQARMREENMERIGEEVRQLMEKL 
REKQLSHSLRAVSTDPPHHDHHDEFC\ LMP 


6143 


2802 


270 


FRMRIFLHCPWNQQMWK1WNLLETSLESCKAHLSIQKLLKER\Q 
\QLPVFKHRDSIVETLKRHRWWAGET\GSGKSTQVPHFUiED 
LLLNE WE ASKCN I VCTQ PRRI S AVS LANR VCDBLGCENGPGGRN 
oui_ux viWMlloKAUllblKijJjiCTTGVIjLRKLjU 
FIVDEV\HER\SVQSDFLLIILKEILQKRSDLHLILMSATVDSE 
KFSTYFTHCP ILRISGRS YPVEVFHLEDI IEETGFVLEKDSEYC 
QKFLEEEEEVTINVTSKAGGIKKYQEYIPVQTGAHADLNPFYQK 
YSSRTQHAI L YMN PHKINLDL I LELLAYLDKS PQ FRN I EGAVL I 
FL PGLAH I QQL YDLLSNDRR FYS ER Y KVT ALHS I LS TQDQAAAF 
TliPPPGVRKIVLATNIAETGITIPDWFVIDTGRTKENKYHESS 
QMS S L VE TF VSKAS ALQRQGRAGR VH DG FCFRM YTR ER FEG FMD 
YS VPE I LRVPL E EL CLH IMKCNLGS PE DFLS KALDP PQLQV I SN 
AMNLLRKIGACELNEPKLTPLGQHLAALPVNVKIGKMLIFGAIF 
GCLDPVATLAAVMTEKSPFTTPIGRKDEADLAKSALAMADSDHL 
T I YNAYLG WKKARQEGG YRS E I TYCRRNFLNRTSLLTLEDVKQE 
LIKLVKAAGFSSSTTSTSWEGNRASQTLSFQEIALLKAVLVAGL
YDNVGKI I YTKSVDVTEKLACI VETAQGKAQVHPSS VNRDLQTH 
GWLLYQEKI R YARVYLRETTL ITP FPVLLFGGDI EVQHRERLLS 
IDGWI YFQAPVK I AV I FKQLRVLI DSVLRKKLENPKMS LENDKI 
LQIITELIKTENN 


6144 


1289 


568 


SGPGSMSGQRVDVKWMLGKEYVGKTSLVERYVHDRFLVGPYQN
VSASGGARHGGRGSGGP VICT YGPDLFPL VA\ TIGAAFVAKVMS 
VGDRTVTLG I WDTAGS ER YEAMS R I YYRG AKAAI VC YDLTDSS S 
FE RAKF WVKELRS LEEG CQI YL CGTKSDLL EED RRRRR VDFHD V 
QD YADN I KAQL FE TS S KTGQS VDELFQ KVAE D YVS VAAFQ VMTE 
DKGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 
GPMVYAICYCPLPRLADLEALKVADSKTLLESERERLFAKMEDT 
DFVGWALDVLSPNLISTSMIjGRVKYNLNSLSHDTATGLIQYALD 
QGVNVTQVFVDTVGM PET YQARLQQS FPG I EVTVKAKADALYPV 
\ VSAAS I CAKVARDQAVKKWQFVEKLQDLDTD YG\SG YPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDS AS ENQEGLRKI TS YFLNEGS Q ARPRS S HR YFLERGLES TTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVLTDEQKS 
R/ YPGHEAHDQGG\ WDARQS I IRKWDPETGRTRLI KGDGEVLE 
E I VTKERHRE INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDMVRQIRALDSDMQTLVYENYNKFISATDTIRKMKNDFRKME 
DEMDRLATNMAVITDFSARI SATLQDRHER I TKLAGVHALLRKL 
QF L FE LPS RLTKC VELGA YGQAVR YQGRAQAVLQQ YQHLPS FRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEQAECVELLLALGEPA 
EELCEE FLAHARGRLEKELRNLEAELGPSPPAPDVLEFTDHG\ S 
SGFVGGLCQVAAAYQELFAAQGPAGAEKLAAFARQLGSRYFALV 
ERRLAQEQGGGDNS LLVRALDRFHRRLRAI>GALLAAAGLADAAT 
E I VER VARERLGHHLQGLRAAFLGCIjTD VRQALAAPRVAG KEG P 
GIAELLANVASSILSHIKASLAAVHLFTAKEVSFSNPCPYFRGEF 
CSQGVREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLLS 
RLCLDYETATISYILTLTDEQFLVQDQFPVTPVSTLCAEARETA 
RRLLTHYVKVQGLVI SQMLRKS VETRDWLSTLE PRNVRAVMKR V 
VED TTA I DVQVLPRLAGVALTQAGGTVP SRGAGAAEDHWQ SL PG 
GGDMC I WASHGAS S VARAS VREPQGNKS PRMNTKRAGECLCPRS 
CSFSAQD YDI FAP ILPVE KQRLRVTQEVRAGLyiiVLKI RPQTNS 
CILPLPHSTGS INSDHVPTK 


6148 


305£ 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDLDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQBTCSDLYVTCQVFAEGKPLALPVRT 
S YKAFS TRWNWNE WL KLPVKY PDLPRNAQ VALT I WD VYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSQMDQKPTKTPGRT 
ob 1 IjoEL^MSRLAKLTKAHRQGHMV J EMINES 
VKRSSNFMYLMGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
L KNXVS Y P PS KP P T YEEQDL VWE FRY YLTNQDKALTKI LTS V I W 
DLPQGAKQALALLGKWKPMDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCTFLISRASKNSTIiANYLYWYVIVECEDQDTQQRDPK 
THEMYLNVMRRFSQALLKGDKSVRVMRSLLAAQQTFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKMNLSDVELI PLPLE PQVK 
IRGI I PETATLFKSALMPAQLFFKTEDGGKYP VI FKHGDDLRQD 
QL I LQ 1 1 S LMDKLLRKENLDLKLT P YKVLATS TKHG FMQF I QS V 
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P VAE VLDT EGS I QNF FRK YAPS ENGPNG I SAEVMDT YVKS CAG Y 
CVI TYILGVGDRHLDNLLLTKTGKLFH IDFGYI LGRDPKPLP P P 
MKLNKEMVEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNLILNLF 
SLMVDANI PDIALEPDKTVKKVQDKFRLDLSDEEAVHYMQSLID 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1 


1413 


RVD PRVRENGTANP I KNGKTS PAS KDQRTGKKTS VQGQVQKGND 
ESESDFESDPPSPKSSEEEEQDDEEVLQGEQGDFNDDDTEPENL 
GHRPLLMDSEDEEEEEKHSSDSDYEQAKAKYSDMSSVYRDRSGS 
GPTQDLNT I LLTSAQLS SD VA VETPKQE FDVFGAVPFFAVRAQQ 
PQQEKNEKNLPQHRFPAAGLEQEEFDVFTKAPFSKKVNVQECHA 
VGPE AHT I PG YPKS VDVFGSTP FQ PFLTS TS KS ES NEDLFGL VP 
FDEITGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
STKKTLKPTYRTPERARRHKKVGRRDSQSSNEFLTISDSKENIS 
VALTDGKDRGNVLQPEESLLDPFGAKPFHSPD\LSWHPP\HQGL 
S \DIRADHNT\VLPGR\ PRQNSLHGSFHSADVLKMDDFGAVP / F 
LTELWQSITPHQSQQSQPV\ELDPFGAAPFPSKQ 


6150 


372 


37 


MSNI KKYI IDYDWKAS I E IEIDHDVMTEEKLHQINNFWSDSE YR 
LNKHG S VLNAVL I MLAQHALL I AI S S DLNAYG WCE FDWNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF 


6151 


1555 


521 


DSNQQSVSGTAASTLLHSFKATIYYQGTGHVQQFYGVTSPYSQT 
TPP I VQS YAQPSLQYI QGQQI FTAHPQGVWQPAAAVTT I VAPG 
QPQPLQPS EMWTNNLLDLPPPS P PKPKT I VLPPNWKTARDPEG 
KI Y Y YHV I TRQTQ WDP PTWE S PGDDASLEHEAEMDLGTPT YDEN 
PMK\ AS KKP KTAEADTS SELAKKSKE VFRKEMS QFI VQCLNPYR 
KPDCKVG\RI TTTEDFKHLARKLTHGVMNKELKYCKNPE \ DLEC 
NENVKHKT KE Y I KKYMQ KFGAVY KP KEDTE FR VTVGPG WE DGWS 
GKTDSRERKSCGPFCSTPVSTVLLMIHHPGBFNPADW 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCGAVPAKCP
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC
GPSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC
KCEHHCPCDPKTGNCSVSRVKQCLQPPEATLRAGELSFFTRTAW
LALTLALAFLLLISTAANLSLLLSRAERNRRLHGDYAYHPLQEM
NGEPLAAEKEQPGGAHNPFKD


6153 


2 


3368 


GRVGARS PGRAYAliLLLLI CFNVGSGLHLQVLSTRNENKLLP KH 
PHLVRQKRAW I TAPVALLEGEDLSKKNP IAKIHS DLAEERGLKI 
TYK YTG KG I TE P P FG I FVFNKDTGELNVTS I LDR E ETP F FLLTG 
YALDARGNNVE KPLELR I KVLDI NDNEP VFTQDVFVGSVEE LS A 
AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT 
GEIYTTSVTLDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQIR 
I LDVNDNI PWENKVLEGMVEENQVNVE VTRI KVFDADE IGSDN 
WLANFTFASGNEGGYFHIETDAQTNEGIVTLIKEVDYEEMKWLD 
FS VI VANKAAFHKS IRSKYKPTP IPI KVKVKNVKEGIHFKSSVI 
S I YVSESMDRSSKGQI IGNFQAFDEDTGLPAHARYVKLEDRDNW 
ISVDSVTSEIKLAKLPDFESRYVQNGTYTVKIVAISEDYPRKTI 
TGTVL INVEDINDNCPTLI EP VQTICHDAEYVNVTAEDLDGHPN 
SGPFSFSVIDKPPGMAEKWKIARQESTSVLLQQSEKKLGRSEIQ 
FL I SDNQGFS CPEKQVLTLTVCE VLHGS \ GCREAQHDS YVGLGP 
AAIALM I LAFLLLL LVPLLLLMCHCGKGAKGFTP I PGT I EMLHP 
WNNEGAPPEDKWPSFLPVDQGGSLVGRNGVGGMAKEATMKGSS 
SAS I VKGQHEMS EMD GRWEEHRS LLSGRATQFTGATGAI \MTTE 
TT ITARATGAS RDVAGAQAAAVALNEE FLKN YFTDKAAS YTEED 
ENHTAKDCLLVYSQEETESLNASIGCCSFIEGELDDRFLDDLGL 
KFKTLAEVCLGQKIDINKEIEQRQKPATETSMNTASHSLCEQTM 
VNSENTYSSGSSFPVPKSLQEAWAEKVTQEIVTERSVSSRQAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVILGPSQPQSLIV 
TERVYAPASTLVDQPYANEGTVWTERVIQPHGGGSNPLEGTQH 






ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
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LQDVP YVMVRERES FLAPSSGVQPTLAMPNI AVGQNVTVTERVL 
APASTLQSS YQ I PTENSMTARNTTVS GAGVPG PLPDFGLEESGH 
SNSTITTSSTRVTKHSTVQHSYS 


6154 


3660 


2146 


KKKTKMKNTLQKTVNFGAWPKPTISDKSHLLQMVSKLDLTDAKN 
SDTAHIKSIEITSILNGLQASESSAEDSEQEDERGAQDMDNNGK 
E ES KI DHLTNNRNDL I S KEEQNS S SLLEENKVHADL VI S KP VS K 
S PERLRKD IE VLSEDTD YEEDEVTKKRKDVKKDTTDKS S KPQI K 
RGKRRYCNTEECLKTGSPGKKEEKAKNKESLCMENSSNSSSDED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSEVAEKRIKLL 
NNS DERLQNSRAKDRKDVWS S IQGQW P KKTLKELFSDSDTEAAA 
S PPHPAPEEGVAEESLQTVAEEESCS PS VELEKPPPVNVDS KP I 
EE KT VE VNDRKAEFPSS G SN FS A* I P L P YLHLNRLHQS L * Q KGS 
RQQSSVTVSEPLAPNQEEVRSIKSETDSTIEVDSVAGELQDLQS 
ERE*LASRF* CQCELEQ * * SARTRTS * KSLYRSEKSERCSGRRK 
FIKKAEKKP* SNSGKQQKEGK 


6155 


869 


121 


HLLPELRGKS WITMKYVFYLGVLAGTFFFADS S VQKEDPAP YLV 
YLKSHFNPCVGVL I KPS WVLAPAHCYLPNLKVMLGNFKSRVRDG 
TEQT I NP I Q I VR Y WNYS H S APQDDLM L I KLAKP AMLNP KVQALN 
P \ PTTNWPGTVCLLSGLDWS QENSGRHPDLRQNLEAP VMS DRE 
CQ KTE QG KSHRN S LCVK FVKVFSR I FG E VAVAT V I CKD KLQG I E 
VGHFMGGDVG I YTNVYKYVS WI ENTAKDK 


6156 


5725 


3984 


GTSTVTKATKKHFS I ILNLLGMLLKKDNQDTRKLLMTWALE VAV 
VMKKS ET YAPLFCLPS FH KF C KGLLADTLVED VN I CLQACS S LH 
ALSSSLPDDLLQRCVDVCRVQLVHRGTCIRQAFGKLLKSIPLGV 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY 
GNSHRTGKDNWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGIIRSLAGHTLNPDQ 
DVSQWTTADNDEGHGNKQLRLVLLLQYLENLEKLMYNAYEGCAN 
ALTSPPKVIRTFLYTNRQTCQDWLTRIRLSIMRVGLLAGQPAVT 
VRHGFDLLTEMKTTS LSQGNELEVS I MMWEALCELHCPEAI QG 
IAVWSSSIVGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCCIS S FDKSVLTLAS AGCKSASLKHCLNGESRKS VLSKPTDS S 
PEVINYLGNKACECYISTADWAAVQEWQNAIHDLKKSTSSTSLN 
LKADFNYIKSLSSFESGKFVECTEQLELLPGENINLLAGGSKEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKIEQKYDADLENKLVDWIILQCAEDIEH 
P P PGRAHFQKWIiMDGTVLCKL I NS L YP PGQE PIPKISES KMAFK 
QMEQI SQFLKAAETYGVRTTDI FQTVDLWEGKDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGMTGYGMPRQIM*DAASCP 


6158 


441 


1482 


LGSL I VLSLHCKVI FSSQSLERAMKE KAVDLVP I LAQNPGLAQN 
PILEGKDHNQNTGVDPIIDHVQDRKTD/ SRSKS PHKKRSKSRER 
RKSRSRSHSRDKRKDTREKIKEKERVKEKDREKEREREKEREKE 

QDKEKEREKDRSKEIDEKRKKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRSS SRS PRTS KTI KRKS SRS PS PRSRNKKDKKREKERD 
HISERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
S S VS KEVDDKDAPRTEENKIQHNGNCQLNEENLSTKTEAV 


6159


53


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRKQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
I I TGTP I LT FVKDPQLEVNF YTGMDEDS D I AFQ FRLH FGHPA I M 
NS CVFGI WR YEE KC Y YLP FEDGKPFELCI YVRHKE YKVMVNGQR 
I YNFAHRF P PAS VKMLiQ VFRD I S LTRVL I SD *GRC VR I TAVQE F 
DVSVSCDCTTAYQPG


6160


1626 


1790 


AGAKFFP* F*KVADAQPTBSEKEI YNQVNVVLKDAEGILFJDLQS 
YRGAGHE IREA I QH PADEKLQE KAWGAWPLVGKLKKFYE FS QR 
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ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEAALRGLLGALTSTPYSPTQHLEREQALAKQFAEILHFTLRFD 
ELKMTNPAIQNDFSYYRRTLSRMRINNVPAEGENEVNNELANRM 
SLFYAEATPMLKTLS DATTKFVS ENKNLP I ENTTDCLSTMAS VC 
RVMLETPE YRS RFTNE ETVSFCLRVMVGVI I LYDHVHPVGAFAK 
TS KI DM KG C I KVLKDQ P PNS VEGLLNALR YTTKHLNDE TTSKQ I 
KSMLQ*QLLTLVNKG 


6161 


455 


1569 


P VS GSES S LRRAWAS I LRLMIiG PRVAVS I LCEDG I SH * LLEKH * 
KSHVLEPLSSLALEBQCLtALSIiDWSTGKTGRAGDOPLJC T T qcnq 
TGQLHLLMVNETRPRLQKVASWQAHQFEAWIAAFNYWHPEIVYS 
GGDDGLLRGWDTRVPGKFUFTSKRHTMGVCSIQSSPHREHILAT 
GS YDEHILLWDTRNMKQPLADTPVQGGVWRI KWHPFHHHLLLAA 
CMHSGFKILNCQKAMEERQEATVLTSHTLPDSLVYGADWSWLLF 
RSLQRAPSWSFPSNLGTKTADLKGASELPTPCHECREDNDGEGH 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSL 
LATCS F YDHALHLWE WEGN 


6162 


1 


586 


RT IHATGRAGAS PMHRL I VWRDAEANKQHVRCQKCLEFGHWTYE 
CTGKRKYLHRPSRTAELKKALKEKENRKLLQQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R+TTKEE 
EKE I ELLHS Y WTDGLKTLM 


6163 


1081 


785 


R I RSTTEGCAVRLHPTQNTGKARIM I LLSVS LGRHWAFT YKFFL 
TPWFVFFFFFFHRKE*VMQKNPMKSREDEWMEKLNNLHVQRAD 
MNRLI MNYLVTEGFKEAAEKFRMESG I EPSVDLETLDERI KI RE 
MILKGQIQEAIALINSLHPELLDTNRYLYFHLQQQHLIELIRQR 
ETEAALSFAQTQLAEQGEESRECLTEMERTLALLAFDSPEESPF 
GDLLHTMQRQKVWSEVNQAVLDYENRESTPKLAKLLKLLLWAQN 
ELDQKKVKYPKMTD LSKGVI EE PK 


6164 


90 


406 


PCQS PGRS RMRQD KLTGS LRRGGRCLKRQGGG VGT I LSNVL KKR 
S C I SRTAP RLLCTLE PG VDTKLKFTLE PS LGQNGFQQWYDAL KA 
VARLS TG I PKEWRRKVWLTLADHYLHS I AIDWDKTMRFTFNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRVVLKRVLLAYAR 
WNKTVGYCQGFNIIiAALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
GYEPPLTNVFTMQWFLTLFATCIiPNQTVLKIWDS VFFEGSE I IL 
RVSLAIWAKLGEQIECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPF PFPQLAEIiREKYTYNI TP FPATVKPTS VSGRHS 
KARDSDEENDPDDEDAWNAVGCLGP FSG FLAPELQKYQKQ I KE 
PNEEQ S LRS NNI AELS PGAINS CRSE YHAAFNSMMMERMTTD I N 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTH I R VHKKNMPRTKSHPG CGDTVGL I DEQNEAS KTNGLGAA 

EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 
VQAFCLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNIiGLYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP*KMVFSSGTMLSRQLPGYPQEYQRN 
GGERFG 


6165 


90 


406 


PCQS PGRSRMRQD KLTGS LRRGGRCIjKRQGGG VGT I LSNVLKKR 
S C I S RTAPRLLCTLEPGVDTKL KFTLEPS LGQNGFQQ WYDAL KA 
VARLSTG I PKEWRRKVV^TLADHYLHS I AIDWDKTMRFTFNERS 
NF DDDS MG I Q I VKD LHRTGCS S YCGQE AEQDR WLKRVLLAYAR 
WNKIVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVXiPES 
YFVNNLRALS VDMAVFRDLLRMKL PE LS QHLDTLQRTANKE SGG 
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGEQIECCETADEFYSTMGRLTQEMLENDLLQSHB 
LMQTVYSMAP F P FPQLAE LREKYT YN I TPFPATVKPTS VS GRHS 
KARDSDEENDPDDEDAWNAVGCLG P FSGFLAPELQKYQKQI KE 
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ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PNEEQSLRSNUIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVTH I PGHTGGKIS PVPYEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDBQNEASKTNGLGAA 
EAF PS G CTATAGREGS S PEGSTRRTI EGQSPEPVFGDADVDVSA 
VQAKLGALELNQRDAAAETELRVH P P CQRH CPE PP S APE ENKAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP*KMVFSSGTMLSRQL?GYPQEYQRN 
GGERFG 


6166 


2 


1206


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEVKRVLYGKELRK 
LDLPREAFEAASREDFBLQGYAFEAAEEQLRRPRIVHVGLVQNR 
I PLPANAPVAEQVS ALHRR I KAI VE VAAMCGVN 1 1 CFQEAWTMP 
FAFCTREKL P WTEFAES AE DG PTTRF CQKLAKNHDM VWS P I LE 
RDSEHGD VLWNTAWISNS GAVLGKTRKNH I PRVGDFNESTYYM 
EGNLGHP VFQTQ FGRIAVNI CYGRHHPLNWLM YS INGAE 1 1 FNP 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFG Y FYGSS YVAAPDS S RT PGLS RSRDGLLVAKLDL 
NLCQQVNDVWNFKMTGRYEMYARELAEAVKSNYS PTI VKE * PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEALCACEEYLSN 
LAHMD I DKDLEAPLYLTPEGWSLFLQRYYQWHEGAELRHLDTQ 
VQRCEDILQQLQAWPQIDMEGDRNIWIVKPGAKSRGRGIMCMD 
HLEEMLKLVNGNPWMKDGKWWQKY I ERPLL I FGTKFDLRQWF 
LVTDWNPLTVWFYRDSYIRFSTQPFSLKNLDK*APLYLTPEGWS 
LFLQRYYQWHEGAELRHLDTQVQRCED I LQQLQAWPQI DMEG 
DRNI W I VKPGAKSRGRG I MCMDHLE EML KLVNGNP WMKDGKWV 
VQKYIERPLLIFGTKFDLRQWFLVTDWNPLTVWFYRDSYIRFST 
QPFSLKNLDK 


6168 


84 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 
ELNEL FKP WAAQ KI S KG AD P KS WCAFFKQGQ CTKGDKCKFSH 
DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEWNKKHG 
EAEKKKPKTQ I VCKHFLEAI ENNK YGWFWVCPGGGD I CMYRHAL 
PPGFVLKKKKKKKKKEDEISL*DLIERERSALGPNVTKITLESF 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVKDDDEEADDTRYTQGTGGDE VDDS VS VND IDLS LYI PRDVD 
ETGITVASLERFSTYTSDKDENKLSEASGGRAENGERSDLEEDW 
EREGTENGAIDAVPVDENLFTGEDLDELEEELNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVI TR 1 1 KEALPDGVN I S KEARSAI SR 
AAS VF VLYATS CANNFAMKG KRKTLNASD VLSAME EME FQR FVT 
PLKEALEAYRREQKGKKEASEQKKKDKDKKTDSBEQDKSRDEDN 
DEDE BRLE EEE QNE EE EVDN* KGRET VAP WKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFITGASRGIGKAIALKAAKDGANIVIA 
A KTAQ PH P KLLGT I YTAAE E I EAVGG KALPC I VD VRDEQQI S AA 
VEKAIKKFGGIDILVNNASAISLTNTLDTPTKRLDLMMNVNTRG 
TYLASKACIPYLKKSKVAHIPNISPPLNLNPVWFKQHCGRW*W 
G * GDGLCL I CF ELNLCMS D V I T I CT 


6171 


382 


941 


HFMQSDVEIiDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAYFPRVLKQVHQALSLSQEAVSVMDSMVRDILD 
R IATEAGHliAH YS KC VTI TSRDI RMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172


651 


54 


GLCRAGOAHRFSRTHVEAALKMIiRREARLRREYLYRKAREEAQR 
S AQERKERLRRALE ENRL I P TELRRE ALALQGSLE FDDAGGEGV 
TS HVDDE YR WAGVEDPKVM I TTSRDPS SRLKMFAKELKLVFPGA 
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ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QRMNRGRHE VGALVRACKANGVTDLLVVHEHRGTPVGL I VSHLP 
FGPTAYFTLCNWMRHDIPDLGTMSEAKPHLITHGFSSRLGKRV 
SDILRYLFPVPKDDSHRVITFANQDDYISFRHHVYKKTDHRNVE 
LTEVGPRFELKLYM I RLGTLEQEATADVEWRWHP YTNTARKRVF 
LSTE*AAPRPLGQLL 


6173 


3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
LQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
R I HTGER P Y VC PLCG KAFNHS T VLRSHQR VHTGE KP HRCNECGK 
TFSVKRTLLQHQRIHTGEKPYTCSECGKAFSDRSVLIQHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHLIQHQKVHRKL* PTCVLS VGSALAGVPTS FS ISVSTLERSP 
MCAVYVGRPSARAQSIjVNTGQFTQVRSPMSVMSVEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGbGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLAIAPLGDAQLVLLRPRRLMNANGRSVARAAELTOL 
TAE E V YL VHDE LDKP LGRLAL KLGGS ARGHNG VRS CIS CLNSNA 
MPRLRVGIGRPAHPEAVQAHVLGCFSPAEQELLPLLLDRATDLI 
LDHIRERSQGPSLGP*H*WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPLRAMAAPVKGNRKQSTEG 
DALD P PAS P KPAGKQNG I QNP I S LE DS PEAGGEREEE QERE EEQ 
AFLVSLYKFMKERHTPIERVPHLGFKQINLWKIYKAVEKLGAYE 
LVTGRRLWKNVYNELGGSPGSTSGATCTRRHY*RLVLPYVRHLK 
GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
M PG KTKADAAD P APLPS QE PPRNS TEQQGLAS G S S VS FVGASGC 
PEAYKRLLS S FYCKGTHG I MS PLAKKKLLAQVS KVEALQCQEEG 
CRHGAEPQASPAVHLPESPQSPKGLTENSRHRLTPQEGLQAPGG 
SLREE AQAG PC PAAP I F KG CF YTHPTE VLKP VS QHPRD FFS RLK 
DGVLLG PPGKEGLS VKEPQLVWGGDANRPSAFHKGGSRKG I LYP 
KPKACWVSPMAKVPAESPTLPPTFPSSPGLGSKRSLEEEGAAHS 
GKR LRAVS P FLKEADAKKCGAKPAGSG LVSCLLG PALG P VP PEA 
YRGTMIiHCPLNFTGTPGPLKGQAALPFSPLVIPAFPAHFLATAG 
PSPMAAGLMHFPPTSFDSALRHRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIGQI IGASGFSESSLFCKWGIHTGAAWKLIiS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDS FGRCQLAGYGFCHVPS S PGTHQLACPTWRPLGS WREQLAR 
AFVGGG PQLLHGDT I YS GADR YRLHTAAGGTVHL E I G LLLRNFD 
R YGVEC * GTL P PTS PPST PRT PSDGGG WHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQIKRSDFLGFSGYSPHFVAISTNSEHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTAALAHGCL
HCHSNFS KKFS F YRHHVN FKS W WVGD I P VS GALLTDWS DDTMKE 

N I FREQVHLIQNAI I ESR IDCQHRCGI FQ YETIS CNNCTDSHVA 
C FG YNCE S S AQWKS AVQGLLN Y INNWHKQDTSMR PRS S AFS WPG 
THRAAPAFLVLPALRCLEPPHLANLSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRLIGIRLTL 
P P P KWDRWNE KRAM FG VYDN I G I LGNFEKHPKEL I RGPI WLRG 
WKGNELQRCI RKRKMVGS RMFADDLHNLNKR I R YL YKH FNRHGK 
FR*KRKLRTSEKAHLSPWRRETVLFPVRKRLCIFSVIKWGFFGI 


6180 


156 


1833 


DHHILKAASTTHVCARGNIFAI PNTRCLEC*ATATPSSLECQN* 
SHLSLCPLPATTSGLTPNSMIPEKERQNIAERLLRVMCADLGAL 
SWSGKEFLKLAQTLVDSGARYGAFSVTEILGNFNTLALKHLPR 
MYNQVKVKVTCALGSNACLGIGVTCHSQSVGPDSCYILTAYQAE 
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SSQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GNH I KS Y VLG VKGAD I RDS GDLVHHWVQN VLS E FVM SE IRTVYV 
TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 
VIELLNVCEDIiAGSTGLAKBTFGSLEETSPPPCWNSVTDSLLLV 
HERYEQICEFYSRAKKMNLIQSLNKHLLSNLAAILTPVKQAVIE 
LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 
KENFKVHPAHKVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV 
KESWAEEADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 
PLFQATPDLFQYWS CVTQKHTKLAKLAFWLIiAVPAVGARSGCVN 
MCEQALLI KRRRLLS PEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLQPRYRKNAYLFI
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS
VSLLELLHIYVGIESNHLLPRFLQLTERIIILFWITSQEEVQE
KYWCVLFVFWNLLDMVRYTYSMLSVIGISYAVLTWLSQTLWMP
IYPLCVLAEAFAIYQSLPYFESFGTYSTKI1PFDL1SIYFPYVLKI
YLMMLFIGMYFTYSHLYSERRDILGIFPIKKKKM*STAFQCDTR
KDRLWIQCSK*NTGSILVEKFLVF


6182 


1769 


1224 


AS*IDYQLNTI J LKEFQLTEENTKtRYLTCSLIEDMAAAYFPDCI
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVPSERIATQK1LSVLGECLDHFGPGCVGVQKILNARCPLVR 
FSHQASGFQCDLTTNNRIALTSSELLYIYGALDSRVRALVFSVR 
CWARAHSLTS S I PGAW ITNFSLTMMVI FFLQRRSP P ILPTLDSL 
KTLADAEDKC V I EGNNCT F VRD LS R I KP S QNTE TLELLL FCEF FE 
YFGNFAFDKNSINIRQGREQNKPDSSPLYIQNPFETSLNISKNV 
SQSQLQKFVDLARESAW ILQQEDTDRPS ISSNRPWGLVSLLLPS 
APNRKS FTKKKSNKFAI ETVKNLLESLKGNRTENFTKTSGKRT I 
STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSQSS CCKP CCCSSGCGSS CCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCLVKAFLM 
VP 


6184 


1 


2191 


IVTVREEDGAPAVAPPGWVSRANKRSGAGPGGSGGGGARGAEE
E P P P P LQAVLVADS FDRR F FP I S KDQ PRVLLPLANVAL I D YTL E 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRIITS 
ELYRSLGDVLRDVDAKALVRSDFLLVYGDVISNIWITRALEEHR 
LRRKL * KNVS VMTMI FKES S PSHPTRCHEDNVWAVDSTTNRVL 
HFQ KTQGLRR FAF P LSL FQG S S DG VEVR YDLLD CH I S I CS P QVA 
QLFTDNFDYQTRDDFVRGLLVNEEILGNQIHMHVTAKEYGARVS 
NLHMYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRG 
PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NWLDQTYLWQG VR VAAGAQ I HQS LL CDNAEVKERVTLKPRS VL 
T S QVWGPNITLP EGS VI S LH PPDAE EDE DDGE FS DDSGADQE K 
DKVKMKG YNPAE VGAAGKG YIjWKAAGMNMEEEEELQQNL WGLK I 

GKEENISCDNLVLEINSLKYAYNISLKEVMQVLSHWLEFPLQQ 
MDSPLDSSRYCALLLPLLKAWSPVFRNYIKRAADHLEALAAIED 
F FLEHEALGISMAKVLMAFYQLE ILAEETI LSWFS QRDTTDKGQ 
QLRKNQQLQRFIQWLKEAEEESSEDD 


6185


791 


44 


PCTSCVLWATLHLPASTRKAPQAECGMISITEWQKIGVGITGFG
IFFILFGTLLYFDSVLLAFGNLiLFLTGLSLIIGLRKTFWFFFQR 
HKLKGTS FLLGGWIVUjRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 
REEWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 
GCQEABMQTPRRLGWGWYHTLTLYLWEEK 


6186 


S>69 238 


VYGIDSSNTNTHGAEERNRKLKKHWKLCHAQSRLDWGLALKNA 






SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
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Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 

amino acid

sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KERKVKNKVKNKADTEEVFNNSPTNQEKMPTSAILPDFSGSVIS 
NIRNQMETLHSQPHQEENLCFENSFSLINLLPINAVEPTSSQQI 
PNRETSEANKERRKMTSKSSESNIYSPLTSFITADSELHDIIKD 
LEDCLMVGLHTCGDLAPNTLRI FTSNSE I KGVCSVGCCYHLLSE 
E FENQHKE RTQE KWGF PMCH YLKE ERWCCGRNARMS ACLALERV 
AAGQGLPTESLFYRAVLQDIIKDCYGITKCDRHVGKIYSKCSSF 
LDYVRRSLKKLGLDESKLPEKIIMNYYEKYKPRMNELEAFNMLK 
WLAPCIETLILLDRLCYLKEQEDIAWSALVKLFDPVKSPRCYA 
VIALKKQQ* FPLKQI IRCISL*DSAGCAEEVSVGDGGPALRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMEPKASCPA 
AAPLMERKPHVLVGVTGSVAALKLPLLVSKLLDIPGLEVAVVTT 
ERAKH F Y S P QDI P VTL YS DADEWE MWKS RS D PVLH I D LRRWADL 
LLVAP LDANTLGKVAS G I CDNLLT C VMRAWDRS KPLL FC PAMNT 
AMWEHP I TAQQVDQLKAFGYVE I PCVAKKLVCGDEGLGAMAEVG 
T I VDKVKE VLFQHSG FQQS * PG I S VMGVPL YS E WVQAKS VKMDV 
G K I GG Y PH LLNGGP AL S L PRGQACS RLNWTEGPGLS F FQ PGEAA 
A 


6188 


238 


1534 


KGFVNAGPLMAELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
N IGVF I C I RCAG IHRNLG VH I S RVKS VNLDQ WTQEQ I Q CMQEMG 
NGKANRLYEAYLPETFRRPQIDPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWFEKVKMPQKKEDPQLP 
RKSS PKSTAP VMDLLGLDAPVACS I ANSKTSNTLEKDLDLLAS V 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QLSKDSILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
PNS I MGSMMPP P VGMVAQPGASGMVAPMAMPAG YMGGMQASMMG 
VPNGMMTTQQAGYMAGMAAMPQTVYGVQPAQQLQWNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLSPQMWK 


6189 


1297 


793 


LGEPLGDLCELIPGDVQQLQMGEVHPGTGAQGSAAQSVAGEVQL 
TQLSHARQRPS CQGS QLIALDLQHMDI SRQPR WQHVQP VARQVQ 
RAQQAQLAEG VAVHL WAGDAWAE VELLQE VGGG KVFAANACDL 
WQDHEGAJHAARQATGHAIiQRVIVQVRRVQPLEAL*RVPSGLPR 
RVRAFMI LHNQ I TGIGREDFATTYFLEELNLS YNRITSPQVHRD 
AFRKLRLLRSLDLSGNRLHMLPPGLPRNVHVLKVKRNELAALAR 
GALAGMAQLRELYLTSNRLRSRALGPRAWVDLAHLQLLDIAGNQ 
LTE I P EGL P E S LE YL YLQNNKI SAVPANAFD S TPNLKG I FLRFN 
KLAVGSWDSAFRRLKHLQVLDIEGNLEFGDISKDRGRLGKEKE 
EEEEDEVEEEETR 


6190


c c 


1309 


I L VGNVS FLLS FAEYVCN CS WGS LNVNRCNQTTGQCE CR PG YQ 
GLHCETCKEGFYLNYTSGLCQPCDCS PHGALS I PCNSSGKCQCK 
VGVIGS ICDRCQDGYYGFSKNGCLPCQCNNRSASCDALTGACLN 
CQENS KGNHCE E CKEG F YQS PDAT KE CLRCP CS AVTSTG S C S I K 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDSICRKCQCHGHVY 
PVKTPKICKPESGECINCLHNTTGFWCENCL*GYVHnT J EGNCTK 
KVILPTPEGSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENS TSALADVS WTQFNI I ILTVI 1 1 VWLLMG FVGAVYM YRE 
YQNRKLNAPFWTIELKEDNISFSSYHDSIPNADVSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511 


VNLCHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
KIDWIKKIWYI YTME YYATI KRNE IMFFAGTWMEMEAI I LS KLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAG IEAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKI KEGMDMNY I IQRKKEFRNPS I YEKL I 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALiAKAQKIEMDKLEK 
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location 
corresponding 
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amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AKKERTKIEFVTCTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
DSAI PVTT I AQPT I LTTTATLPAWTVTTSASGS KTTVI SAVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDBNSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHIX^DKIOKLYBRXTKEGMDM^TTnRTfTTiriPPMDCTVwirT t 

V'wii uiu\i i\.i-iTiJl ii*/l'JL<i i J. iynn n n. n £sJMr& Ji X JLfVJLlJL 

QFCAIDELGTNYPKDMFDPHGWSEDSxTEALAKAQKIEMDKLEK 
AKKE RTKI E FVTGTKKGTTTNATS TTTTTAS TAVADAQ KR KS KW 
DSAI PVTTIAQPTI LTTTATLPAWTVTTSASGS KTTVI SAVGT 
IVKKAKQ 


6194 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKI EFVTGTKKGTTTNATSTTTTTASTAVADAQKRKS KW 
DSAI P VTTI AO PT TLTTTATT i P A WTVTTC 21 Qtl C VTTT / t o a i rn t 
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDSPSVRKTHCSGRKHKENVKD 
YYQKWMEEQAQSLIDKTTAAFQQGKIPPTPFSAPPPAGAMIPPP 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMP^PGPPMMRPPARPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILDNAEQVISNLEARNLGPRLTPLLQEEDSH 
QRLLMGLMVS E L KDHFT iT? HT /V3\7P WKX VHMWT tyvtcvt rmTn 

HIVETNWKHNLHSWVLHFNSRGSAAEFAVFHIMTRILEATNSL 
FLPLPPGFHTLHTILGVQCLPLHNLLHCIDSGVLLLTETAVIRL 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR 
NHVTRLLQtTi'KKQPRNSMlNKSSFSVEFLPLNYFIEILTDIESS 
NQ AL Y P FE GHDNVDAE F VE E AALKHTAMLLG L 


6197 


3 


819 


ADPEGTEEAVMS RYTRPPNTS LFIRNVADATRPEDLRREFGRYG 
P I VDVYI PLDFYTRRPRG FAYVQ FED VRDAEDAL YNLNRKWVCG 
RQIEIQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RS S S WGRNRRRS DS LKESRHRRFS YS Q S KSRSKS L PRRS TS ARQ 
SRTPRRNFGSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPS FI SPACFLLRKLPALEDGTLPHPDTLGMNYEGARSE 
RENHAADDSEGGALDMCP^ ER TiPOTiPOP T VMT? ZiT ,ni? zvppt j^nc n 

REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFS G YGH I WS QNATNLVS S LLTLLKQ LE PTAWLDSGT W 
GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEPPTPLPPEDRRQSV 
S RQ PS FT YS E WMEE K I EDDFLDLDP VPE TP VFDCVMD I KPEAD P 
TSLTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYLMSPREES 
ARE YLLS AS R VLQAEELHEKALD P FLLQ AE FFEI PMN F VD PKE Y 
DIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINANYIRG 
YGGEEKVY IATQGP I VSTVAD F WRMVWQEHTP 1 1 VM I TN I EEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPI I 
VHCSAGIGRTGCF I ATS I CCQQLRQEGWD I LKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQSPE 


6199 


144 


1211 


MAR ENGES S SS WKKQAED I KK I FEFKETLGTGAFSEWLAEEKA 
TGKLFAVKCIPKKALKGKESSIENEIAVLRKIKHENIVALEDIY 
ESPNHLYLVMQLVSGGELFDRIVEKGFYTEKDASTLIRQVLDAV 
YYLHRMGIVHRDLKPENLLYYSQDEESKIMISDFGLSKMEGKGD 
VMSTACX3TPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPP 
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amino acid 
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location 
corresponding 
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residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FYDENDS KLFEO I LKAE YE FD «? P YW nr> T <? n «; a Trn p t j?wt"mi? v h n 
NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 

AFNATAWRHMRKLHLGSSLDSSNASVSSSLSLASQKDC7VSGTF 
HAL* 


6200 


702 


96 


LPEVPHSLRPRVKPHLCriAOPAVWVMZiRT.DVT.avpnT nvrT mtv5 — 
W VDTHVD P P FHKS S DGT VRDRRGQDVRL Y P E VPE VLKRLQ S LG V 
PGAAASRTSEIEGANQLLELFDLFRYFVHREIYPGSKITHFERL 
QQKTGIPFSQMIFFDDERRNIVDVSKLGVTCIHIQNGMNLQTLS 
QGLETFAKAQTGPLRSSLEESPFEA 


6201 


2809 


2383


GQTPRVRWKMRRSLRAGKRRQTAGRKSKSPPKVPIVIQDDSLPA
G P P PQI R I LKRPT SNG WS S PNS TS RPTL P VKSLAQRE AE YAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE
DG I FDSGNFEQFLRE KVKVNGKTGNLGNWHI ERFKNKI TWS E 
KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEDSSESED 


6203 


419 


2550 


RCPR P PATAGAAASR PDRSPPSGISGS EAAAGAGAAAPAS QH PA 
TGTGAVQTEAMKQILGVIDKKLRNLEKKKGKLDDYQERMNKGER 
LNQDQLDAVSKYQEVTNNLEFAKELQRSFMALSQDIQKTIKKTA 
RREQLMREEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 
r j. uo a£>£.Jjo liLtUtL r i Kij VDPE RDMS bRLNEQ YEHAS I HLW DLLE 
GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEEEEAA 
SAPAVEDQVPEAEPEPAEEYTEQSEVESTEYVNRQFMAETQFTS 
GEKEQVDEWTVETVEWNSLQQQPQAASPSVPEPHSLTPVAQAD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTLDPAIVSAQPM 
N PTQNMDMPQLVCPP VHS ES RLAQPNQ VP VQPEATQVP L VS S TS 
EG YTASQPLYQPSHATEQRPQKEP I DQIQATI SLNTDQTTAS SS 
LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFNMNAPVP 
P VNEPETLKQQNQ YQAS YNQSFS S QPHQVEQTELQQEQLQTWG 
TYHGSPDQSHQVTGNHQQPPQQNTGFPRSNQPYYNSRGVSRGGS 
RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 
RDYSGYQRDGYQQNFKRGSGQSGPRGAPRGRGGPPRPNRGMPQM 
NTQQVN 


6204 


2933 


787 


CTHNLISLLGGRALIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV 
PGD 1 1 KS WS KEMDKRYLQFD IKAFVENNPAI KWC PTPGCDRAV 
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWECLGEAHEP 
CDCQTWKNWLQK I TE MKP EELVGVS EAYED AANCLWL LTNS K PC 

YRCTRYEVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHSYQLEQRLLKTAKEKMEQLSRALKETEGGCPDTTFIEDAV 
HVLLKTRRI LKCS Y P YG FFLE P KSTKKE I FELMQTDLEMVTEDL 
AQKVNRP YLRTPRHKI IKAACLVQQKRQE FLAS VARGVAPADS P 
EAPRRSFAGGTWDWEYLGFASPEEYAEFQYRRRHRQRRRGDVHS 
LLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
S LRD YTPAS RS ENQ D S LQ ALS S LDEDD PN ILLAI QLS LQES GLA 
LDEETRDFLSNEASLGAIGTSL PS RLDSVPRNTDS PRAALS SS E 
LLELGDS LMRLGAENDP FS TDTLS SH PLSE ARSDFCP S SSDPD S 
AGQDPNINDNLLGNIMAWFHDMNPQS IALIPPATTEISADSQLP 
C I FCDGSEGVKDVELVLPEDSMFEDASVSEGRGTQ I EENPLEEN I 
PGGGKQHPQAW 


6205 


1 


1200 


RAHRGKMALEVGDMEDGQLSDSDSDMTVAPSDRPLQLPKVLGGD 
SAMRAFQNTATACAPVSHYRAVES VDS SEES FSDS DDDS CLWKR 
KRQKC FNP P PKP E P FQ FGQS S QKP P VAGG KKI NN I WGAVLQEQN 
QDAVATELGILGMEGTIDRSRQSETYNYLLAKKLRKESQEHTKD 
LD KELDE YMHGGKKMGS KEE ENGQGHLKR KR P VKDRLGNRPEMN 
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ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








YKGRYEITAEDSQEKVADEISFRLQEPKKDLXARVVRIIGNKKA 
IELLMETAEVEQNGGLFIMNGSRRRTPGGVFLNLLKNTPSISEE 
Q I KD I F Y I ENQKE YENKKAARKRRTQVLG KKMKQAI KS LNFQE D 
DDTSRET FASDTNEALAS LDE SQEGHAE AKLEAE EA I EVDHSHD 
LDIF 


6206 


10 


1442 


IISERRERSCLHLVCIRCSCDVVEMGSVLGLCSMASWIPCLCGS 
APCLLCRCCPSGNNS TVTRL I YALFLLVGVCVACVML I PGMEEQ 
LNKI PG FCENEKG WP CN I LVG Y KAVYRLC FGLAMF YLLLS LLM 
IKVTCSSSDPRAAVHNGFWFFKFAAAIAIIIGAFFIPEGTFTTVW 
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
ALLSATALNYLLSLVAIVLFFVYYTHPASCSENKAFISVNMLLC 
VGASVMSILPKIQESQPRSGLLQSSVITVYTMYLTWSAMTNEPE 
TNCNPS LLS 1 1 G YNTTSTVP KEGQS VQ WWHAQG 1 1 G LI LFLLC V 
FYSSIRTSNNSQVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERDG VTYS YS F FHFML FLAS L YI MMTLTNW YR YEPS REM 
KSQWTAVWVKISSSWIGIVLYVWTLVAPLVLTNRDFD 


6207 


2924 


1471 


T VMAEAAT PGTTATT SGAGAAAATAAAAS PT PI P TVTAPS LGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
S SS LSS I VGPLVEMNTGEAES RNSNFATVGAGSEDWVNAI EFVP 
GQP YCGRTAPS CTEAPLQGS VTKE ESEKEQTAVETKKQLCP YAA 
VGE CRYGENCVYLHGDS CDMCGLQVLHPMDAAQRSQH IKSCIEA 
HE KDME LS FAVQRS KDMVCG I CME WYEKANPSERRFG I LSNCN 
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTS SR YRAQRRNHFWEL IEERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6208 


2924 


1471 


TVMAEAATP G TTATTS GAGAAAATAAAAS PTP I PTVTAPS LGAG 
GGGGGSDG S GGG WTKQ VTCRY FMHG VCKBGDNCRYS HDLS DS P Y 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
SSSLSS IVGPLVEMNTGE AESRNSNFATVGAGS EDWVNAI EFVP 
GQPYCGRTAPS CTEAPLQGSVTKEES EKEQTAVETKKQLCPYAA 
VGECR YGENCVYLHGDS CDMCG LQVLHPMDAAQRSQH I KS CIEA 
HEKDMELS FAVQRS KDMVCG I CME WYEKANPSERRFG I LSNCN 
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE 
KEEKQKLI LKYKEAMSNKACR YFDEGRGS CP FGGNCFYKHAYPD 
GRREE PQRQKVGTSSRYRAQRRNHFWEL IEERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDS EDEWDL FHDELEDFYDLDL 


6209 


1758 


829 


ERLCFPCMQS KI YS YMS PNKCSGMRFPLQEENS VTHHE VKCQGK 
PLAG I YRKREEKRNAGNAVRSAMKSEEQKI KDARKGPLVP FPNQ 
KSEAAEPPKTPPSSCDSTNAAIAKQALKKPIKGKQAPRKKAQGK 
TQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
K I D L I DGKGRGVI ATKQFS RGDFWE YHGDLI E I TDAKKREAL Y 
AQD PSTGCYM YYFQYLS KT YCVDATRETNRLGRL INHS KCGNCQ 
TKLHD IDGVPHLI LI ASRDI AAGEELL YDYGDRS KAS IEAHPWL 
KH 


6210 


3761 


387 


I FGMS KLRMVLLEDS GS ADFRRHFVNLS PFTI TWLLLSACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S V I CNQLGCP TA I KAPGWANS SAGS GR I WMDHVS CRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGR I E 
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG 
ADL SLRL VDG VTE CSGRLE VR FQGE WGTI CDDG WDS YDAAVACK 
QLGCPTAVTAIGRVNASKGFGH I WLDS VS CQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRIiRGGGSRCAGTVEVE IQRL 
LGKVCDRGWGLKEAD WCRQLG CGS ALKTS YQVYS K I QATNTWL 
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV 
GGDI PCSGR VEVKHGDTWGS I CDSDFSLEAASVLCRELQCGTW 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVG WCS RYTE I RLVNG KT P CEGRVE L KTLGAWGS L CNSH WD 
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ 
HMGDCP VTALGAS LCPS EQVAS VI CSGNQSQTLSS CNSS SLG PT 
RPTIPEESAVACI ESGQLRLVNGGGRCAGRVEI YHEGSWGTI CD 
DSWDLSDAHWCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN 
GKES R I WQCHS HG WGQQNCRH KEDAG VI CS E FMS LRLTS E AS RE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWBKRLASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VG I LGWLLA I FVAL FFLTKKRRQRQRLAVS S RGENL VHQ IQ YR 
EMNS CLNADDLDLMNS S GGHS E PH 


6211 


3761 


387 


I FGMS KLRM VL LEDSG S AD FRRHF VNLS P FT I TWLLLS ACF VT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S VI CNQLG C PTAI KAPGWANS SAG SGRI WMDHVS CRGNE S ALWD 
CKHDG WGKHS WCTHQQDAG VTCS DGSNLEMRLTRGGNM CS GR I E 
I KFQGRWGTVCDDNFNI DHAS V I CRQLECG S AVS FSGSSNFGEG 
SG P I WFDDL I CNGNES ALWNCKHQG WGKHNCDHAEDAGV I CS KG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAI GRVNAS KG FGH I WLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL 
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV 
GGDI PCSGRVEVKHGDTWGS I CDSDFSLEAASVLCRELQCGTW 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVGWCS RYTE I RLVNG KTPCEGRVELKTLGAWGSLCNSHWD 
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ 
HMGDCPVTALGAS LCPSEQVAS VI CSGNQSQTLS S CNS S SLGPT 
RPTI PEESAVACIESGQLRLVNGGGRCAGRVE I YHEGSWGTI CD 
DS WDLSDAHWCRQLGCGEAINATGS AHFGEGTGP I WLDEMKCN 
GKE S R I WQCHS HG WGQQNCRHKEDAG VI CS E FMS LRLTS EASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP 
ASLD KAMS I P M WVDNVQ CP KG PDTLWQC PS S PWE KRIAS PS EET 
W I TCDNKI RLQEGP TS CSGRVE I WHGGS WGTVCDDS WDLDDAQ V 
VCQQLGCGPALKAFKEAEFGQGTG P I WLNE VKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDI S VQKTPQKATTGRS SRQSSFIA 
VGILGWLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNS CLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRS CCCQTNPGPP SSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
\j\-ox\.irriijCirvLi l J. iKXJbJSooJruVIBVi ilbKPPAERHMISSWE 
QKNNCVMPEDVKNF YLMTNGFHMTWS VKLDEH 1 1 PLGSMAI NS I 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 
VI FELDS CNGSGKVCLVYKSGKPALAEDTE I WFLDRALYWHFLT 
DTFTAYYRLLITHLGL PQWQYAFTS YGIS PQAKQRVSMYKP IT Y 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPF PACHE IGLGAEAGSG P P PAP AARESRS RAME E EAS S PGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINS1 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFIiDRALYWHFLT 
DTFTAYYRLL ITHLGLPQWQYAFTS YGIS PQAKQRVS MYKPITY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HELAPS AI RRAARLGLGP ARWQSRAAAFYFVRGFRTGWS FVGWV 
VLGTSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYN 
YRTYAVRR I RDAFRENKNVKDP VE I QTLVNKAKRDLGVI RRQVH 
IGQIjYSTDKLI I ENRDMPRT 


6215 


2 


1849 


FVAGG PRGS GSAAETMPE I RVT PLGAGQDVGRS C I LVS I AGKNV 

MTinr*RMHMnPNTir)l?P'PPr>TrcVT r PO"Mr2DT. , T , nT?T r^m/TTotJTTTTr r>r r 
viuLj^\jyi nrivj r nuLJC\S\r cus oil i^/iMijtCLj iu,c v X X orlr xiLiEH 

CGALPYFSEMVGYDGPIYMTHPTQAICPILLEDYRKIAVDKKGE 
ANFFTSQM I KDCMKKWAVHLHQTVQVDDELE I KAYYAGHVLGA 

YATTIRDSKRCRERDFLKKVHETVERGGKVLIPVFALGRAQELC 
ILLETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWTNQKIRKT 
FVQRNMFEFKHIKAFDRAFADNPGPMWFATPGMLHAGQSLQIF 
RKWAGNEKNMVIMPGYCVQGTVGHKILSGQRKLEMEGRQVLEVK 
MQVEYMSFSAHADAKGIMQLVGQAEPESVLLVHGEAKKMEFLKQ 
KIEQELRVNCYMPANGETVTLPTSPSIPVGISLGLIiKREMAQGL 
LPEAKKPRLLHGTLI MKDSNFRLVS SEQALKELGLAEHQLRFTC 
R VHLHDTRKEQE TALRVYS H LKS VLKDHCVQHLPDGS VTVE S VL 
LQAAAPS EDPGTKVLLVS WT YQDEELGS FLTSLLKKGLPQAPS 


6216 


11 


393 


QTTRPEPRNSALRQSRSKMAWGVSSVSRLLGRSRPQLGRPMSS 
GAHGE EGS ARMW KTLTFFVALPGVAVSMLNVYLKSHHGEHERPE 
FIAYPHLRIRTKP FP WGDGNHTLFHNPHVNPLPTG YE DE 


6217 


9 


1178 


TR VGRGE S GLKM E VKPP PGRPQ PDSGRRRRRRGE EGHDP KE P EQ 
LRKLF IGGLS FETTDDSLREHFEKWGTLTDCWMRDP QTKRSRG 
FGFVTYSCVEEVDAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 
HLTVKK I FVGG I KEDTE E YNLRD YFE KYGKI ETI E VMEDRQSGK 
KRG FAFVTFDDHDTVDK I WQ KYHT INGHNC E VKKALS KQEMQ S 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRG S YGGGDGG YNGFGGDGGNYGGG PG YS SRGG YGGGGPGYG 
NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGQQQS 
NYGPMKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218 


1305 


906 


S CERRG F IMADDLKRFLYKKLPS VEGLHA I WSDRDG VP VI KVA 
NDNAPEHALRPGFLSTFALATDQG9KLGLSKNKSIICYYNTYQV 
VQFNRLP L WS F I AS S S ANTGL I VSLE KELAPL FEELRQ WE VS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMAS AGGEDCES PAPEADR PHQR P FL 
IGVSGGTASGKS TVCEKIMELLGQNEVEQRQRKWI LSQDR F YK 
VLTAE QKAKALKG Q YNFDH PDAFDNDLMHRTLKN I VE GKTVE VP 
TYDFVTHSRLPETTWYPADWLFEGILVFYSQEIRDMFHLRLF 
VDTDSDVRIiSRRVLRDVRRGRDLEQIIiTQYTTFVKPAFEEFCLP 
T KKYADV 1 1 PRG VDNM VAI NL I VQH IQD I LNGD I CKWHRGGSNG 
RSYKRTFSEPGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


EQNI S LE MS CTI E KALAD AKAL VERLRDHDDAAE SL I E QTTALN 
KRVEAMKQYQEE IQELNEVARHRPRSTLVMGI QQENRQIRELQQ 
ENKELRTSLEEHQSALELIMSKYREQMFRLLMASKKDDPGIIMK 
LKEQHSKIDMVHRNKSEGFFLDASRHILEAPQHGLERRHLEANQ 
NVH 


6221 


98 


916 


RWI WDIjNPVS DGLELRPKYNGI LHCLTTI WKLDGLRGLYQGVTP 
NI WGAGLS WGLY FVFYNAI KS YKTEGRAERLEATE YLVS AAEAG 
AMTLC ITNPLWVTKTRLMLQ YDAWNS PHRQYKGMFDTLVKI YK 
YEGVRGLYKGFVPGLFGTSHGALQFMAYELLKLKYNQHINRLPE 
AQ L S TVS Y I S VAALS K I FAVAAT Y P YQ WRARLQDQHMF YSG V I 
DVITKTWRKEGVGGFYKGIAPNLIRVTPACCITFWYENVSHFL 
LDLREKRK 


6222 


2 


2116 


MARELRALLLWGRRLRPLLRAPALAAVPGGKP I LCPRRTTAQLG 
PRRNPAWSLQAGRLFSTQTAEDKEEPLHS I ISSTESVQGSTSKH 
EFQAETKKLLDIVARSLYSEKEVFIRELISNASDALEKLRHKLV 
o LJIjU/UjF bMfc. 1 HUJ 1 NAh KGT I T I QD I G I GMTQEELVSNLGTI A 
RS GS KAFLDALQNQ AEASS KI I GQFGVGF YS AFMVADR VE VYS R 
SAAPGSLGYQWLSDGSGVFEIAEASGVRTGTKIIIHLKSDCKEF 
S S E ARVRD WTKYS NF VS F P L YLNGRRMNTLQA I WMMD PKDVRE 
WQHEEFYRYVAQAHDKPRYTLHYKTDAPLWIRSIFYVPDMKPSM 
FDVSRELGSSVALYSRKVLIQTKATDILPKWLRFIRGWDSEDI 
FijJM LioReliLiQES AL I RKIjRD VLQQRX) I KF F I DQSKKDAEKYAKF 
r liL*! oJjrTlKbtjl V lAibybVJ^DlAlUjJjKYESSAIjPSGQIiTSLS 
E YAS RMRAGTRN I Y YLCAPNRHLAEHS P YYEAMKKKDTEVL FC F 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CLS E KE TEEIJ^WMROTI/SS R VTNVKVTLRLDTHPA^^\^^ , VLEMG 
AARH FLRMQQLAKTQEERAQLLQPTLE I NPRHAL I KKLNQLRAS 
E PGLAQLLVDQI Y ENAM I AAG L VDDPRAMVGRLNELLVKALE RH 


6223 


3 


715 


DAWARTMAGMVDFQ D E EQVKS FLENME VE CN YHC YHEKD PDG C Y 
RLVDYLEGIRKNFDEAAKVLKFNCEENQHSDSCYKLGAYYVTGK 
GGLTQDLKAAARCFLMACEKPGKKS I AACHNVGLLAHDGQVNED 
GQ PDLG KARD YYTRACDGG YTS S C FNLS AMFLQGAPGF PKDMDL 
ACKYSMKACDLGHIWACANASRMYKU3DGVDKVEAKAEVLKNRA 
QQVHKEQQKGVQPLTFG 


6224 


1 


133 


LRTI SSMAWGPLLLTLLAHCTGS WAQS VLTQP PS VSGAR I PHEK 


6225 


3259 


938 


LLS CHRLA I CKLP FS VE S RKTVMG PQGARRQAFLiAFGD VTVD FT 
QKEWRLLS PAQRAL YR E VTLENYSHL VS LG I LHSKP5L I RRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGLAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVI IH 
KKAHS RQKL FTCRE CHQG FRDE S ALLLHQNTHTG EKS YVCS VCG 
RGFS LKANLLRHQRTHSGE KP FLCKVCGRGYTS KS YLTVHERTH 
TGEKPYECQECGRRFNDKSSYNKHLKAHSGEKPFVCKECGRGYT 
NKS YF WHKR IHSGEKP YRCQE CGRGFSNKSttLITHQRTHSGEK 
P FACRQCKQS FS VKGS LLRHQRTHSGEKPFVC KDCERS FS Q KST 
LVYHQRTHSGEKPFVCRECGQGFIQKSTLVKHQITHSEEKPFVC 
KDCGRGFIQKSTFTLHQRTHS E E KP YGCRECGRR FRDKS S YNKH 
LRAHLGEKRFFCRDCGRGFTLKPNLTIHQRTHSGEKPFMCKQCE 
KS FS L KANLLRHQ WTHSGERPFNCKD CGRG F I LKS TLLFHQKTH 
SGEKPFICSE CGQGF I WKS NLVKHQLAHSG KQ P FVCKE CGRGFN 
WKGNLLTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWRIHSKE 
KP F VCQECKRG YTS KS DLT VHER I HTGERP YECQE CGR KFSNKS 
YYS KHIjKRHLREKRFCTGS VGEASS 


6226 


29 


266 


TKVS EIiLGGSQRLFFLPLWRRLCRCGLGPRVS PMAGPRVE VDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227 


2581 


890 


ms as s IjIjEOR p kgognkvono *? VHrnf no t .Mnr>T") PP P YT . ^ pna T? d 
NNAYTAMSDS YL PS YYS P S I G FS YS LGEAAWSTGGDTAMP YLTS 
YGQLSNGEPHFLPDAMFGQPGALGS TPFLGQHGFNFFPSG IDFS 
AWGNNSSQGQS TQS SGYSSN YAYAPS SLGGAM IDGQSAFANETL 
NKAPGMNTIDQGMAALKLGSTEVASNVPKVVGSAVGSGS ITSNI 
VASNSLPPATIAPPKPASWADIASKPAKQQPKLKTKNGIAGSSL 
PP P P I KHNMD I GTWDNKG P VAKAPS QAL VQNIGQ P TQGS PQPVG 
QQANNS P PVAQAS VGQQTQPLPPPPPQPAQLS VQQQAAQPTRWV 
APRNRG SGFGHNG VDGNGVGQS QAGS GS TPS E PHP VLEKLRS IN 
NYNP KD FD WNLKHGRVF I 1 KS YSEDD I HRS I KYN I WCSTEHGNK 
RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VWSQDKWKGRFDVRWIFVKDVPNSQLRHIRLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTSIFDDFSHYEKRQ 


6228 


47 


1978 


GRRCRRRGAVMELAQELARELGCWAVEEMGVPVAARAPESTLRRL 
CLGQGADIWAYILQHVHSQRTVKKIRGNLLWYGHQDSPQVRRKL 
ELEAAVTRLRAE IQELDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAG AMRRQQHTLRD PMQRLQNQLRRLQDMERKAKV 
DVT FGS LTS AALGL E P WLRD VRTACTLRAQ FLQNLLL PQAKRG 
SLPTPHDDHFGTSYQQWLSSVETLLTNHPPGHVLAALEHLAAER 
EAEIRSLCSGDGLGDTEISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTLLKERQVLTQRLCX3LVEEVERRVLGSSERQVL 
ILGLRRCCLWTELKALHDQSQELQDAAGHRQLLLRELQAKQQRI 
LHWRQLVEETQEQVRLLI KGNSAS KTRLCRS PGEVLALVQRKW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPSIHQLHPASPRGSSFIALSHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
QKENLGQALKRLEKLLKQALERIPELQGIVGDWWEQPGQAALSE 
E LCQGLSL P QWRLR WVQAQG ALQKLCS 


6229 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAVDDLQFEEFG 
NAATSLTANPDATTVNIEDPGETPKHQPGSPRGSGREEDDELLG 
NDDSDKTELLAGQKKSS PFWTPE YYQTFFDVDT YQVFDR I KGSL 
LP I PGKNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLSNFLI 
HLGE KTYH YVP E FRKVS I AAT 1 1 YAYAWL V P LALWG FLM WRNS K 
VMNIVSYSFLEIVCVYGYSLFIYIPTAILWIIPHKAVRWILVMI 
ALG I SGS LLAM TFW P AVREDNRR VALAT I VT I VLLHMLLS VGCL 
AY F FDAP EMDH LPTTTAT PNQTVAAAKS S 


6230 


1723 


600 


S KMSGRSGKKKMSKLSRSARAG V I FP VGRLMR YLKKGTFKYRI S 
VGAPVYMAAVT EYLAAE I LELAGNAARDNKKAR IAPRHI LLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGEX3FTI LSSKSLVLGQKLSLTQSD I SHIGSMRVEGI VHP 
TTAEIDLKEDIGKALEKAGGKEFLETVKELRKSQGPLEVAEAAV 
SQS SGLAAKFVI HCH I PQWGSDKCEEQLEET I KNCLS AAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLL FDS ES 1 G I YVQE MAKLDAK 


6231 


149 


870 


L I FS S S TMDRS LRNVL WS FGFLLL FTAYGGLQ S LQS S L Y S EEG 
LGVTALSTLYGGMLLS SMFLPPLL I ERLGCKGT 1 1 LSMCG YVAF 
SVGNFFASWYTLIPTSILLGLGAAPLWSAQCTYLTITGNTHAEK 
AGKRGKDMVNQYFGIFFLI FQSSGVWGNLISS LVFGQTPSQETL 
PEEQLTSCGASDCLMATTTTNSTORPSQQLVYTLLGIYTGSGVL 
AVLM I AAFLQP IRDVQRE SE 


6232 


3679 


1476 


F VAGTTMAGFWVGTAPLVAAGRRGRW P PQQLM LS AALRTLKHVL 
YYSRQCLMVSRNI/3SVGYDPNEKTFDKILVANRGEIACRVIRTC 
KKMG I KTVAI HSDVDAS S VHVKMADEAVCVG PAPTS KSYLNMDA 
IMEAI KKTRAQAVHPGYG FLSENKE FARCLAAEDWFIGPDTHA 
I QAMGDKI ES KLLAKKAEVNTIPGFDGWKDAEEAVRIARE I G Y 
P VM I KAS AGGGGKGMR I AWDDEETRDGFRLS SQEAAS S FGDDRL 
L IE KF I DNPRH I E I Q VLGD KHGNALWIiNERECS I Q RRNQKWE E 
APS I F L DAETRRAMG EQA VALARAVKYS SAGTVE FLVDS KKNFY 
FLEMNTRLQVEHP VTEC I TGLDLVQEMI RVAKG YPLRHKQAD I R 
INGWAVECRVYAEDPYKS FGLPS IGRLSQYQEPLHLPGVRVDSG 
I QPGS D I S I Y YDPM I S KL I T YGSDRTE ALKRMADALDNYV I RG V 
THN I ALLREVI I NSR FVKGD I S TKFLS DVY PDG FKGHMLT KS E K 
NQLLAI ASSLFVAFQLRAQHFQENSRMPVI KPD IANWELS VKLH 
DKVHTWASNNGS VFS VEVDGS KLNVTSTWNLAS PLLSVSVDGT 
QRTVQCLS REAGGNM S IQ FLGT VYKVN I LTRLAAE LNKFMLE KV 
TEDTS S VLRS P M PG VWAVS VKPGDAVAEGQE I C VI EAMKMQNS 
M TAGKTG TVKS VHCQ AGD TVG EG DL L VE LE 


6233 


1 


2654 


HS TRENLNAGNFNFPS EGHLVRSTGPGGS FAKHMVAQCVSP KGP 
LACSRTYFFGATHVPYLGGDSKLPKKTEQIRLLSQI YAAVI EAV 
LAGIACYAKTSSLTKAKEVAEQTLGSGLDSFELIPFKAALRSKM 
TFHIHAVNNQGRI VPLDSEDSLS FVKTACMAVYDI PDLLGGNGC 
LGSWFSESFLTSQILVKEKDGTVTTETSSWLTAAVPRFCSWL 
VEDNEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLYSSNLQSWP 
E EGNVH F FS SGLL FS HCRHG S 1 1 1 S KDHMNS I S FYDGDSTS TVA 
ALLI DFKS SLLPHLPVHFHGSSNFLM IALFPKSKI YQAFYS BVF 
S LWKOODNSGI SLKVI OEDGLS VEOKRLH<? ^ADTfT ,V<Z ST .<?OD2W2 
EKRSSLKLLSAKLPEIiDWFLQHFAISSISQEPVMRTHLPVLLQQ 
AE INTTHR IESDKVIIS I VTGLPGCHAS ELCAFLVTLHKE CGRW 
MVYRQ I MDS SECFHAAHFQRYLS SALEAQQNRSARQS AYIRKKT 
RLLWLQGYTDVI D WQALQTHPDSNVKAS FTIGAI TACVE PMS 
C YMEHR FL F PKCLDQC SQG L VSNWFTS HTTEQRHPLLVQLQ S L 
IRAANPAAAFILAENGIVTRNEDIELILSEKTSFSSPEMLRSRYL 
MYPGWYEGKLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKAIQS 
SIKPSPFSGNXYHILG.KVKFSDSERTMEVCYNTLANSLSIMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWLRQSA 
KQKPQRKALKTRGMLTQQE I RS I HVKRHLE P LPAG YFYNGTQ FV 
NFFGDKTDFHPLMDOFMNDYVEE ANT?PT FWVWfWT .EWYFVTJnT.T? 
ELKP 


6234 


1731 


404 


PRVREDMDHKSPGNKGSLVYAGIKSIVKSSLGMVESSRHNWSGL 
DKQSDIQNLNEER I LALQLCGWI KKGTDVDVGPFLNSLVQEGEW 
ERAAAVAIiFNIjD I RRAI Q I LNEGAS S E KGDLNLNVVAMAIiSG Y T 
DEKNSLWREMCSTLRLOLiNNP YLCVMFAFT .T FTf; <? YnaUT ,vpn 
KVAVRDRVAFACKFLS DTQLNR Y I EKLTNEMKEAGNLEG I LLTG 
LTKDGVDLMES YVDRTGDVQTAS YCMLQGS PLDVLKDERVQYW I 
ENYRNLLDAWRFWHKRAEFDIHRSKLDPSSKPLAQVFVSCNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKLAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNIjVPAETV 

QP 


6235 


1 


571 


EKRDHRLPSWPRAALKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
DYESDDDSYEVLDLTEYARRHQWWNRVFGHSSGPMVEKYSVATQ 
IVMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQIASHSGYVQI 
DWKRVEKDVNKAKRQIKKRANKAAPEINNLIEEATEFIKQNIVI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARtHAE 
NAI RQKNQAVNFLRMS ARVDAVAAR VQTAVTMGKVTKSMAG VVK 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DBLSQRLARLRDQV 


6237 


312 


720 


PTAMAEEG IAAGGVMDVNTALQEVLKTAL IHDGLARGIREAAKA 
LDKRQAHLCVLASNCDEPMYVKLVEALCAEHQ INLI KVDDNKKL 
GEWVGLCKIDREGKPRKWGCS CWVKDYGKESQAKDVI EE YFK 
CKK 


6238 


2 


4666 


eevptqesvkweinviiknpeivfvadmtIcndapalvittqcei 
cykgnlenstmtaai kdlqvrac pflpvkrkgki ttvlqpcdlf 
yqttqkgtdpqv i dm s vks ltlkvs p vi intm i t i tsal yttke 

TIPEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMIKMNIDS I FI VLEAGIGHRTVPMIiLAKSRFSGEGKETWSSL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEIDQTEDFRPWNLGIK 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
L VMLNNL VKAFTE AATG S S ADFVKDLAPFM I LNS LGLT I S VS PS 
DSFSVLNIPMAKSYVLKNGESLS^4DYIRTKDNDHFNAMTSLSSK 
LFFILLTPVNHSTADKI PLTKVGRRLYTVRHRESGVERS IVCQI 
DTVEGS KKVT I RS P VQ I RNHFS VP L S VYEGDTLIX3TAS P ENE FN 
I PLGSYRS FIFLKPBDENYQMCEGIDFBEI I KNDGALLKKKCRS 
KNPSKESFLINIVPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RNLLPYKIAYYIEGIENSVFTLSEGHSAQICTAQLGKARLHLKL 
LDYLNHDWKS EYHI KPNQQDI S FVS FTCVTEMEKTDLDI AVHMT 
YNTGQTWAPKS PYWMVNKTGRMLQ YKADGIHRKHP PNYKKPVL 
FSFQPNHFFNNNKVQLMVTDSBLSNQFSIDTVGSHGAVKCKGLK 
M D YQ VGVT I DLS S FN I TR I VT FT P F YM I KNKS KYH I S VAEEGND 
K WLS LDLE Q C I P FWP E YAS SKLL I Q VERS E DP P KR I Y FNKQENC 
I LLRLDNE LGG 1 1 AE VNLAEHSTVI TFLD YHDGAAT FLL I NHT K 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES EKAELAEQE I AVALQDVG I SLVNNYTKQE VAY IG ITS SD 
VWETKPKKKARWKPMSVKHTEKLEREFKEYTESSPSEDKVIQL 
DTNVP VRLT PTGHNMK I LQ PHVI ALRRN YL PALKVE YNTSAHQ S 
SFRIQIYRIQIQNQIHGAVFPFVFYPVKPPKSVTMDSAPKPFTD 
VS I VMRSAGHS Q I S R I KY F KVL I QEMDLRLDLGF I YALTD LMT E 
AE VTENTEVEL FHKD I E AF KE E YKT ASLVDQSQ VS LYE YFH I S P 
I KLHLS VSLSSGREEAKDS KQNGGL I P VHSLNLLLKS IGATLTD 
VQD WFKLAFFE LN YQ FHTTSDLQS EVIRHYS KQAI KQM YVL I L 
GLD VLGN P FGLI RE F S EG VEAF F YE P YQGAI QG P EE FVEGMALG 
LKALVGGAVGG LAGAAS K I TGAMAKGVAAMTMDED YQQKRREAM 
NKQ PAG FREG I TRGG KGLVSGFVSG I TG I VTKP I KGAQKGGAAG 
F F KG VGKGL VGAVARP TGG 1 1 DMAS STFQG I KRATETS EVE SLR 
P P R F FNEDG V I RP YRLRDGTGNQM LQ KIQFYRE W I MTHS S S S DD 
DDDDDDDDESDLNH 


6239 


2108 


634 


KPGMAGKGSSGRRPLLLGLLVAVATVHLVICPYTKVEESFNLQA 
THDLLYHWQDLEQYDHLEFPGWPRTFLGPWIAVFSSPAVYVL 
SLLEMS KFYSQL I VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM 
FCWTAMQFHLMFYCTRTLPNVLALPVVLLAIiAAWLRHEWARFI 
WLSAFAI I VFRVELCLFLGLLLLLALGNRKVS WRALRHAVPAG 
ILCLGLTVAVDS YFWRQLT WPEGKVLWYNTVLNKS SNWGTS PLL 
WYF YS AL PRGLGCS LLF I P LGLVDRRTHAPT VLALG FMAL YS LL 
PHKELRF 1 1 YAFPMLN I TAARGCS YLLNNYKKS WLYKAGSLLVI 
GHL WNAAYS ATAL YVSH FNYPGG VAMQRLHQL VP PQTDVLIiHI 
DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGMLAYTHILMEAA 
PGLLAL YRDTHRVLAS WGTTGVS LNLTQL P P FNVHLQT KL VLL 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTS IAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGFELGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDFESVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHREPSPVRYDNLSRHIVASLQEREKL 
LRQ S P PLPGREE EPGLGDSG I QST PGS GHAPRTS S S S DDS KRS P 
lAsKTPlAjKP/WPKr IjKPIajJjRGRGVGoPBPGP 1AP i LGRSMSYS 
SQKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLGS 
ASPGPGQPPIiSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAEE KKRLS LQRE KI IARVS I DNRTRALVQALRRTTD PKLC I T 
RVEELTFHLLEFPEGKGVAVKERI I PYLLRLRQIKDETLQAAVR 
EILALIGYVDPVKGRGIRILSIDGGGTRGWALQTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNV I VGTVKMS WSHAF YDS QTWEN I LKDRMGS ALM I ETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQA IRAS S AAPG YFAE YALGNDLHQDGGLLLNNPSALAMHE CKC 
LWPDVPLECIVS LGTGRYES DVRNTVT YTS LKTKLSNVINSATD 
TEE VH I M LDGLL P PDT YFRFNP VMCEN I P LDE SRNE KLDQLQLE 
GUCYIERNEQKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP 
FFSKL 


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVEMGE 
SSED I DQMFSTLLGEMDLLTQS LGVDTLPP PDPNPPRAE FNYSV 
GFKDLNESLNALEDQDLDALMADLVADISEAEQRTIQAQKESLQ 
NQHHS AS LOAS I FSG AAS LGYGTNVAATG I fin YPnni, p p p da n d 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKKLV 
VKVHMNDNS TKSLMVDERQLARDVLDNLFEKTHCDCNVDWCLYE 
I Y P ELQ I ERF FE DHENVVE VL S D WTREyTENK T T . pt .P Tf ft? k va\ri? 
KNPQNFYLDNRG KKES KETNEKMNAKNKESLLEVRL I LQSGRKE 
KDVCS I FKS FASENNGKI 


6243 


1509 


614 


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TSRAS SRRLACGPQTRAGAETRSTAMIRANSAARDTRRATCRSA 
AGT P S P TTMTCLTDVP TGCAAVE PTARLPAAAWASTI TTGCCPA 
MGQAGAGPAGRKGSEAGGGPGRAHHAHPS PLPRE PRVRTGPPAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPLK 
GP P I LAP I LS LTP I LS R WS C YF PR <3 R T ACldWUl . <! 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRVJNRSSKYLDEAYE 
EMVNI I EYNKELQAKVNILRRQLAELETEDGMQES P 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
ICISLAFWIISMTASTYYGNLRPISPWRWLFSVWPVLIVSNGL 
KKKSLDHSGALGGLWGFILTIANFSFFTSLLMFFLSSSKLTKW 
KGE VKKRIiDS E Y KEGGORNWVOVFCWG AVPT R T.aT.T ,VM T fhp. dp 

E I PVDFS KQYSASWMCLSLLAALACSAGDTWASE VGPVLS KS SP 
RL I TTWE KV P VGTNGG VTWGLVS S LLGGT FVG I AYFLTQL I FV 
NDLDISAPQWPIIAFGGLAGLLGSIVDSYLGATMQYTGLDESTG 
MVVNSPTNKARHIAGKPILDNNAVNLFSSVLIALLLPTAAWGFW 
PRG 


6246 


1177 


359 


S LW P W I LMDDS LMQ I SLQLLC VYTANFPNGCS SLC W S S CGQH P V 
QATHRGAVSNS LMLC I L KLASQMP LENTTVQQMVFMLLS NLALS 
HDC KG VI QKSNFLQNFLS LALPKGGNKHLSNLT I L WLKLLLN X S 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
PANKP K I LANEKV I T VLAACLE SENQNAQRIGAAAL WAL I YNYQ 
KAKTALXS PS VKRR VDEA YS LAKKTF PNS EANPLNAY YL KCLEN 
LVQLLNSS 


6247 


3 


1678 


NS RVWGP WTE PS AGS LRPM AR K ONR NS KP X.dX .VPT .TnnT cuara 
PGPGRALLEC0HLRSGVPGGRRRKDWS CSLLVAS LAGAFGS S FL 
YG YNLS WNAPT P Y I KAF YNE S WERRHGR P I D PDTLTLLWS VW 
S I FAI GGL VGTL I VKM IGKVLGRKHTLLANNG FAI S AALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
QVTAI F I C I GVFTGQLLGLPELLGKES T WP YLFGV I WPAWQIi 
LSLPFLPDS PR YL LIjEKHNEARAVKAFQTFLGKAHVS QEVE E VL 
AESRVQRS I RLVS VLELLRAP YVRWQ WTVI VTMAC YQLCGLNA 
I WFYTNS I FGKAGI PPAKI PYVTLSTGGIETLAAVFSGLVIEHL 
GRRPLLIGGFGLMGLFFGTLTITLTLQDHAPWVPYLSIVGILAI 
I AS F CSGPGG I P F I LTGE F FQQ S QRPAAF 1 1 AGTVNWLSNFAVG 
LLFPFIQKSIiDTYCFLVFATICITGAIYLYFVLPETKNRTYAEI 
SQAFS KRNKAYPPEEKIDSAVTDGKINGRP 


6248 


56 


1773 


VP P PRMMAAVPPGL E P WNRVR I P KAGNRS AVTVQN PGAALDLC I 
AAVIKECHIiVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII 
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAXGINKLLNKLF 
L I NEQS PRAS EETLLG I SKKAKQMKIKVQNNVDLGQ P VKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAVVWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK 
TS KHHLRQRRS QNKFLRRQRKPQRKLQS TL LRE IQQ F S QGTR KS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVI QTKEKM I 
HENLRG IHENETDS WTVMQ INKNSTSGT I KETDD IDDI FALMGV 


6249 


56 


1773 


VP P PRMMAAVP PGLEPWNRVRI P KAGNRS AVTVQNPGAALDLC I 
AAVI KE CHLVTLS LKS QTLDAETD VLCAVL YSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
S Q P VVELVLMKVLGACKLLLRLLD CCC KTFLLTVKHLGLQEF 1 1 
LNLVMVGLVSRLWVLYKGVLKRL I LLYEPLFGLLQEVAR I QPMP 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKS FVQR FREAESFTQLS EE IQMAWWCRS KKLKAQAI 
FLGNKLLKSNRLKHLEAQGTS LPKKLEC I KTS I CNHLLRGSG I K 
TS KHHLRQRRS QNKFLRRQRKPQRKLQS TLLRE I QQ FSQGTR KS 
ATDTS AKW RLSHCTVHRTDLY PN S KQ LLNS G VSMPVI QTKEKM I 
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFALMGV 


6250 


232 


1306 


LAALHIMALPFRKDLEKYKDLDEDELLGNLSETELKQLETVLDD 
LD P ENALL PAG FRQ KNQTS KS TTGP FDREHLL S YLE KEALEHKD 
RED YVPYTGEKKGKI FI PKQKPVQTFTEEKVSLDPELEEALTSA 
SDTELCDLAAI LGMHNL ITNTKFCN I MGSSNGVDQEHFSNWKG 
EKILPVFDEPPNPTNVEESLKRTKENDAHLVEVNLNNIKNIPIP 
TLKDFAKALETNTHVKCFS LAATRSNDPVATAFAEMLKVNKTLK 
S LNVESNF I TGVG I LAL I DALRDNETLAE LK I DNQRQQLG TAVE 
LE MAKMLE ENTN I L KFG YQ FTQQG PRTRAANA I T KNNDL VRKRR 
VEGDHQ 


6251 


62 


972 


T PG SG PMS AWAAAS LS RAAARCLLARG PG VRAAP PRD PRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNKWSKVRHIKGPKDVERSRIFS 
KLCLNI RLAVKEGGPNPEHNSNLAN I LE VCRS KHMPKSTI ETAL 
KMEKSKDTYLLYEGRGPGGSSLLIEALSNSSHKCQADIRHILNK 
NGGVMAVGARHS FDKKGVI WEVEDREKKAVNLERALEMAIEAG 
AED VKE TEDE EERNVFK F I CDAS S LHQ VRKKLDS LGLCS VS CAL 
E F I PNS KVQLAE PDL EQAAHL IQALS NHEDV I HVYDNI B 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGAS PGPPRNKKNRELRPQRPKNAYILKKSRIS KKPQV 
PKKPRE WKNP E S QRGLS GAQD PF PG P APVP VE WQKFCR I DKSR 
KLPHS KAKTRSRLE VAEAEEEETS I KAARSELLLAEE PGFLEGE 
DGEDTAKI CQAD I VEAVD IASAAKHFDLNLRQFGPYRLNYSRTG 
RHLAFGGRRGHVAALDWTKKLMCB INVMEAVRD IRFLHSEALL 
AVAQNRWLHI YDNQG I ELHC I RRCDRVTRLEFL P FH FLLATASE 
TGFLT YLD VS VG K I VAALNARAGRLDVMSQNP YNAV I HLGHS NG 
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGT YQ PLSTRTL PHGAGHLAFSQRGLLVAGMGD WNI WA 
GQGKAS P PS LEQ P YLTHRLS G P VHGLQ FCPFEDVLG VGHTGG I T 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
S S TAS L VKRKRKVMDEEHRD KVRQS LQQQHHKE AKAKPTGAR PS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGAS PG PPRNKKNRELR PQRP KNAY ILKKS R I S KKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLEVAEABEEETSIKAARSELLLAEEPGFLEGE 
DGEDTAKI CQAD I VEAVD IASAAKHFDLNLRQFGPYRLNYSRTG 
RHLAFGGRRGHVAALDWVTKKLMCE INVMEAVRD IRFLHSEALL 
AVAQNRWLHIYDNQGIELHCIRRCDRWRLEFLPFHFLLATASE 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TV5?LWS PAMKEPLAKI LCHRGGVRAVAVDSTGTYMATSGLDHQL 
K I FDLRGTY Q PLSTRTLPHGAGHLAFS QRGLL VAGMGDWNI WA 
GQG KAS P P SL EQP YLTHRLS GP VHGLQ FCP FED VLG VGHTGG I T 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERIiGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALG RRGG S QELSAAACGC FALRLRAPGS GR PALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLKEYRICMPLTVDEYKIGQLYMISKH 
S HEQ S DRG EG VE WQNE P FED PHHGNGQ FTE KRVYLNS KLP S WA 
RAWPKI F YVTEKAWNYYPYTITE YTCS FLPKFS IHI ETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
HKWRDIIiLI GHRQAFAWVDE W YDMTMDDVRE YEKNMHEQTNI K 
VCNQHS S P VDD I ES HAQTS T 


6255 


1 


1444 


PTRPQQELLVSLATVI FVASQKALSVESKAVI KQQLES VSNGWT 
VYR I ARQAS RMGNHDMAKELYQS LLTQ VAS KH F YFWLNS LKEFS 
HAEQCLTGLQEENYSSALSCIAESLKFYHKGIASLTAASTPLNP 
LS FQCE F VKLR I DLLQAFSQL I CT CNS LKTS P P PAI ATT I AMTL 
GKDLQRCGRISNQMKQSMEEFRSLASRYGDLYQASFDADSATLR 
NVELQQQS CLL I SHAI EALI LDPESAS FQE YGSTGTAHADS E YE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIALLKVPL 
S FQRYF FQKLQS TS IKLALS PS PRNPAEP IAVQNNQQLALKVEG 
WQHGSKPGLFRKIQSVCLNVSSTLQSKSGQDYKIPIDNMTNEM 
EQRVE PHND YFS TQFLLNFAILGTHN I TVES S VKDANG I VWKTG 
PRTTIFVKSLEDPYSQQIRLQQQQAQQPLQQQQQRNAYTRF 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSLESTSTSVPPAPGTMATDSWALA 
VDEQEAAAE S LSNLHL KE EK I KPDTNGAWKTNANAE KTDE EE K 
EDRAAQS LLNKL I RSNLVDNTNQVEVLQRDPNS PLYS VKS FEEL 
RLKPQLLQdVYAMGFNRPSKIQENALPLMLAEPPQNLIAQSQSG 
TGKTAAFVIAMLSQVEPANKYPQCLCLSPTYELALQTGKVIEQM 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK 
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF 
SATFEDSVWKFAQKWPDPNVIKLKREEETLDT I KQYYVLCS SR 
DEKFQALCNLYGAITIAQAMIFCHTRKTASWLAAELSKEGHQVA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTWCARGIDVEQVSW 
INFDL P VDKDGNPDNETYLHR I GRTGR FGKRGLAVNM VDS KH S M 
NI LNR I QEHFNKKI ERLDTDDLDE I E KI AN 


6257 


210 


615 


AF I PAMAE L I Q KKLQGE VE KYQQLQKDLS KS MSGRQKLEAQLTE 
NNI VKEELALLDGSNWFKLLGPVLVKQELGEARATVGKRLDY I 
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6258 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKS MSGRQKLEAQLTE 
NNIVKEELAIiDGSNVVFKLLGPVLVKQELGEARATVGKRLDYI 
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6259 


2 


1540 


I LEKG F P S Q CH PERKW KVDD VLES SQENEDDHFWELLFHNNKT V 
SVENGDRGSKTFNLGTDPVSLRNYPYKICDSCEMNLKNISGLII 
S KKNCS RKKPDE FNVCEKLLLD I RHEKI P IGE FCS YKYDQKRNAI 
NYHQDLSQPS FGQS FEYSKNGQGFHDEAAFFTNKRSQI GETVCK 
YNECGRTFIESLKLNISQRPHLEMEPYGCSICGKSFCMNLRFGH 
QRALT KDNP YE YNE YGE I FCDNS AFI IHQGAYTRKI LRE YKVS D 
KT WEKSALLKHQ I VHMGGKS YD YNENGSNF SKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ 
KPHLTNHQRTHTGEKPYECKQCGKTPCVKSNLTEHQRTKTGEKP 
YECNACGKS FCHRSALTVHQRTHTGEKPFI CNECGKS FCVKSNL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRSVLTKHQRIHTRVKALSTS 


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMALRELKVCLLGDTGVGKSSIVW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMYYRGSAAAIIVYDITKEETFSTLKNWVXELRQHGPPNI 
WAIAGNKCDL IDVREVMERDAKD YADS IHAI FVETSAKNAINI 
NELFIEISRRI PSTDANLPSGGKGFKLRRQPSEPKRSCC 
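As a reading aid for the two numeric columns, the following minimal Python sketch computes how many nucleotides and whole codons lie between a predicted beginning location and a predicted end location. It assumes three nucleotides per amino acid residue, and it treats a beginning location larger than the end location as a reverse-orientation entry; that interpretation is an assumption and is not stated in the table. The function name `codon_span` is hypothetical.

```python
def codon_span(begin: int, end: int) -> dict:
    """Summarize the distance between two nucleotide locations.

    Assumption (not stated in the table): begin > end means the
    segment is read in the reverse orientation.
    """
    distance = abs(end - begin)
    # Three nucleotides per residue; keep any leftover nucleotides visible.
    codons, remainder = divmod(distance, 3)
    return {
        "orientation": "forward" if begin <= end else "reverse",
        "nucleotides": distance,
        "codons": codons,
        "remainder": remainder,
    }


# Locations as listed for SEQ ID NO: 6258 and SEQ ID NO: 6260.
print(codon_span(210, 615))
print(codon_span(2081, 1436))
```

A nonzero remainder would only indicate that the two locations do not bound a whole number of codons under this counting convention; it says nothing by itself about the listed sequence.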


6261


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR 
S PACE PCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQ PGP WPG 
MAEVSIDQSKLPGVKBVCRDFAVLEDHTLAHSLQEQE I EHHLAS 
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA 
QEIQEKLAIEAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP 
E FPATRA YADS YYYEDGGMKPRVMKEAVST P S RMAHRDQE W YDA 
E IARKLQEEELLATQVDMRAAQVAQDE E IARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RP P P P I MTDG EDAD YTHFTNQQS STRHFSKS E S SHKG FHYKH 


6262 


2 


1759 


PECHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEEIL 
GSTRLVSQGLEALRSEHQAVIjQSLSQTIECLQQGGHEEGLVHEK 
ARQLRRSMEN I ELGLS EAQVMLALAS HLST VE S E KQKLRAQ VRR 
LCQENQWLRDELAGTQQRLQRSEQAVAQLEEEKKHLEFLGQLRQ 
YDEDGHTSEEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGGYEIPARLRTLHNLVIQYAAQGRYEVAVPLCKQALEDLER 
TSGRGHPDVATMLNILALVYRDQNKYKEAAHLLNDALSIRESTL 
GPDHPAVAATLNNIaAVLYGKRGKYKEAEPLCQRALEIREKVIiGT 
NHPDVAKQLNNLALLCQNQGKYEAVERYYQRALAI YEGQLG PDN 
PNVARTKNNLASCYLKQGKYAEAETLYKEILTRAHVQEFGSVDD 
DHKPIWMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSSPTV 
NTTLRNDGALYRRQGKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


REIiDSLADLPERIKPPYANGLSTSHLRSSSVEDVKLIISEGRPT 
IEVRRCSMPSVICEHTKQFQTISEESNQGSLIiTVPGDTSPSPKP 
EVFSNVPERDLSNVSN IHSS FATS P TGASNS KYV SADRNL I KNT 
APVNTVMDSP VHLEPS SQ VGVIQNKSWEMPVDRLETLSTRDF I C 
PNS N I PDQES S LQS FCN S ENKVLKENAD FLSLRQTEL PGNS CAQ 
DPAS FMPPQQPCS FPSQS LSDAES I SKHMSLS YVANQEPG ILQQ 
KNAVQ IIS SALDTDNE S T KDTENTFVLGDVQKTDAF VP VYSDS T 
IQE AS PNFE KAYTL P VLP S E KDFNG S D ASTQLNTHYAFS KLT YK 
SS S GHE VENS TTDTQ VI S HE KENKLES LVLTHLSRCDSDIjCEMN 
AGMPKGNLNEQDPKHCPESEKCLLSIEDEESQQSILSSLENHSQ 
QSTQPEMHKYGQLVKVELEENAEDDKTENQIPQRMTRNKANTMA 
NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQP VQVS PS LLQAKEKTQQSIAAIVDSLKLDE IQPYSS ER 
ANPYFEYLHIRKKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRLQHSIE 
REKLIVSNEQEVLRVHYRAARTIiANQTLPFSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 
QHEAAALNAVQRLEWQLKLQELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQENNALDMAPE IHMTGPMCL IBNTNGEL VANPEAL K I LS A I 
TQPWWAI VGL YRTGKS YLMNKLAGKNKGFSLGSTVKSHTKG I 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL 
S STLVYNS MGT I NQQAMDQL YYVTELTHR I RS KS S PDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPE FVQQVADFCS YI FSNS KTKTLSGG I KVNGPRLESLVLT Y 
INAISRGDLPCMENAVLALAQ I ENSAAVQKAIAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCS ALLQVI FS PLEEEVKAG I YSKPGG 
YCL F I QKLQDLEKK YYE E PRKG I Q AEE I LQT YLKS KES VTDAI L 
QTDQ ILTEKEKE I EVECVKAESAQASAKMVEEMQ I KYQQMMEEK 
EKSYQEHVKQLTEKMERERAQLLEEQBKTLTSKLQEQARVLKER 
CQGESTQLQNE IQKLQKTLKKKTKR YMSHKLKI 


6265 


143 


1960 


KHRQENNALDMAPBIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL 
SS TL VYNS MGT INQQAMDQL YY VTELTHR I RS KS S PDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
S Q KDKNFNLPRLC IRKFFPKKKCFVFDLPI HRRKLAQLEKLQDE 
ELD PEFVQQVADFCSY I FSNS KTKTLSGG I KVNGPRLESLVLTY 
INAI SRGDLPCMENAVLALAQ I ENS AAVQKAIAH YDQQMGQKVQ 
LPAETLQELLDLHRVS EREATE VYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEASSDRCSALLQVIFS PLEEEVKAG I YSKPGG 
YCLFIQKLQDLE KKY YEE PR KG I QAEE I LQTYLKS KES VTDAI L 
QTDQILTEKEKEIBVECVKAESAQASAKMVEEMQIKYQQMMEEK 
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNE IQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYS LQNRKPS LYGSLTCQG I GLDG I PE VTAS E 
GFTVNEINKKS IHI S CPKENASSKFLAPYTTFSRIHTKS ITCLD 
I SSRGGLGVS SSTDGTMKI WQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKI WSAEDASC WTFKGHKGGI LDTAIVDR 
GRNWSASRDGTARLWDCGRSACLGVLADCXjSSINGVAVGAADN 
S I NLGSP EQMP SERE VGTEAKMLLLARED KKLQCLGLQS RQLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGAPVLSLLSVRDGFIASQGDGSCFIVQQDLDYVTELTGADCD 
PVYKVATWEKQIYTCCRDGLVRRYQLSDL


6267


3 


622 


LGMMKKNNSAKRGPQDGNQQPAPPEKVGWVRKFCGKGIFREIWK 
NRYWLKGDQLYISEKEVKDEKNIQEVFDLSDYEKCEELRKSKS 
RSKKNHSKFTLAHSKQPGNTAPNLIFLAVSPEBKESWINALNSA 
I TRAKNR I LDE VTVEEDS YLAHPTRDRAKI QHS RRP PTRGHLMA 
VASTSTSDGMLTLDLIQEEDPS PEEPTSLC 


6268 


160 


136^8 


HRE LCQNL P AGLS SAL I DNPLTLLL S I DT YVMLQE P VT FQD VAV 
DFS REEWGLLGPTQRTE YRDVMLET FGHLVS VGWETTLENKELA 
PNS DI PEEE PAPS LKVQES S RDCALS STLEDTLQGGVQE VQDTV 
LKQMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 
SQANSGALDTNQVLLHKI PPRKRLRKRDSQVKSMKHNSRVKIHQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
I FRNPRYFS VHKKIHTGERP YVCQDCGKGFVQS SSLTQHQRVHS 
GE R P FECQE CGRTFNDRS A I SQHLRTHTGAKP YKCQDCGKAFRQ 
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR 
KKKQPTS 


6269 


2886 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL 
TQ YVKWTND KS LG G I EG CLS KLKAAD PTFVMGHAMATGLVL I GT 
GSSVKLDKELDLAVKTMVE I SRTQPLTRREQLHVSAVETFANGN 
FP KACE LWEQ I LQDHPTDMLALKFSHDAYFYLGYQEQMRDS VAR 
I YP FWTPDI PLS S YVKG I YS FGLMETNFYDQAEKLAKEALS INP 
TDAWSVHTVAHIHE^4KAEI KDGLEFMQHSETLWKDSDMLACHNY 
WH WAL YLI E KG E YEAALT I YDTH I LP S LQANDAMLDWDS CS ML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQELLTTLRDASESPGENCQHLLARDVGLPLCQALVEAE 
DGNPDRVLELLLPIRYRIVQLGGSNAQRDVFNQLLIHAALNCTS
SVHKNVARSLLMERDALKPNSPLTERLIRKAATVHLMQ 


6270 


23 


2086 


S VTVTLGS EGDGRP P TYHLEEM EQEPQNGE PAE I K 1 1 REAYKKA 
FL FVNKG LNTDE LGQ KEEAKN Y YKQG I GHLLRG I S I S S KE S EHT 
GPGWESARQMQQKMKETLQNVRTRLE I LE KGLATS LQNDLQE VP 
KLYPE F P P KDMCE KL PE PQ S F S S APQHAE VNGNTS TP S AGAVAA 
PAS LS LP S QS C PAEAPPAYT P QAAEGH YT VS YGTDSGE FSS VGE 
EFYRNHSQPPPLETLGLDADELILIPNGVQIFFVNPAGEVSAPS 
YPGYLR I VRFLDNSLDTVLNR PPGFLQ VCD WLYPLVPDRSPVLK 
CTAGAYMFPDTMLQAAGCFVGWLSSELPEDDRELFEDLLRQMS 
DLRLQ ANWNRAEEENE FQI PGRTR PS S DQLKE ASGTDVKQLDQG 
NKDVRHKGKRGKRAKDTSSEEVNLSHIVPCEPVPEEKPKELPEW 
SEKVAHNILSGASWVSWGLVKGAEITGKAIQKGASKLRERIQPE 
E KP VEVS P AVT KGL Y I AKQATGGAAKVS Q FLVD GVCT VANCVGK 
ELAPHVKKHGSKLVPESLKKDKDGKSPLDGAMWAASSVQGFST 
VWQGLECAAKC I VNNVS AETVQT VRYKYG YNAG EATHHAVDS AV 
NVGVTAYNINNIGIKAMVKKTATQTGHTLLEDYQIVDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


1058 


GCGVKTAGMVG REKE LS IH FV PG S CRLVE EE VN I PNRR VLVTGA 
TGLLGRAVHKE FQQNNWHAVG CG F RRARP KFE Q VNLLD S NAVHH 
I IHDFQPHVI VHCAAERRPDWENQPDAASQLNVDASGNLAKEA 
AAVGAFL I Y I S S D YVFD GTNP P YREEDI PAP LNLYG KT KLDGE K 
AVLENNLGAAVLRI P I L YGEVEKLEESAVTVMFDKVQFSNKSAN 
MDHWQQRFPTHVKDVATVCRQLAEKRMLDPS I KGTFHWSGNEQM 
TKYEMACAI ADAFNLPS SHLRP I TDS PVLGAQRPRNAQLDCSKL 
ETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272 


1136 


528 


GAVMEDAAAPGRTEGVLERQGAPPAAGQGGALVELTPTPGGLAL 
VS P YHTHRAGD PLDLVALAEQVQ KADE FI RANATNKLTV I AEQ I 
QHLQEQARKVLEDAHRDANLHHVACNI VKKPGN I YYLYKRESGQ 
QYFS I ISPKEWGTSCPHDFLGAYKLQHDLSWTPYEDIEKQDAKI 
SMMDTLLSQSVALPPCTEPNFQGLTH 


6273 


256 


843 


SCPRVS PECRSLGCQVMFSLPLNCS PDHIRRGS CWGRPQDLKIA 
SAAWNS KCHPGAGAAMARQHARTL W YDRPR YV FME FCVE DS TDV 
HVL I EDHR I VFS CKNADG VEL YNE I E F YAKVNS KDSQD KRS S RS 
ITCFVRKWKEKVAWPRLTKEDIKPVWLSVDFDNWRDWEGDEEME 
LAHVEHYAEVRDNT YCVL PT 


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLSRFRGCLAGALLGDCVGSFYEAHDT 
VDLTSVLRHVQSLEPDPGTPGSERTEALYYTDDTAMARALVQSL 
LAKE AFD E VDMAHR FAQ E YKKD PDRG YGAG WTVF KKLLNP KCR 
DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYS S VQDVQKFARLS 
AQLTHASSLGYNGAILQALAVHLALQGESSSKHFLKQLLGHMED 
LEGDAQSVLDARELGMEERPYSSRLKKIGELLDQASVTREEWS 
ELGNGIAAFES VPTAI YCFLRCMEPDPE I PSAFNS LQRTLIYS I 
SLGGDTDTI ATMAGAIAGAYYGMDQVPES WQQS CEGYEETD I LA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVPRPAKTMAFMVKTMVGGQLKNLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 
KAERATLRSHFRDKYRLPKNETDESQ IQMAGGDVE L PRELAKM I 
EEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE 
KCHVM 


6276 


797 


97 


TLLPLPPLPDTEGMILLNTGLEGTVAENPVPIVHTPSGNILTLE 
SCLQQLATHPGHWGIHLQIAEPAALRPSLALLARLSSLGLLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LGSGYREQLLTDMLELCQGLWQPVSFQMQAMLLGHSTAGAIGRL 
LASS PRATVTVEHNPAGGDYAS VRTALLAARAVDRTRVYYRLPQ 
GYHKDLLAHVGRN 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
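
The single-letter legend above can be expressed as a lookup table. The sketch below is illustrative only: the names `LEGEND` and `describe` are hypothetical, and skipping whitespace inside a segment is an assumption made for convenience.

```python
# One-letter symbols as listed in the legend of the table.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def describe(segment: str) -> list[str]:
    """Expand each symbol of a segment; whitespace is skipped (assumption)."""
    names = []
    for symbol in segment:
        if symbol.isspace():
            continue
        # Symbols outside the legend are flagged rather than guessed.
        names.append(LEGEND.get(symbol, f"unrecognized symbol {symbol!r}"))
    return names


# Three residues followed by a stop symbol, chosen for illustration.
print(describe("MAF*"))
```

Flagging unknown symbols instead of silently dropping them keeps characters that fall outside the legend visible for manual review.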


6277 


4600 


2744 


MAFRTE MGL Y YS YFKT I VE APS FLNG VWM IMNDKLTE YPLV INT 
LKRFNLYPEVILASWYRIYTKIMDLIGIQTKICWTVTIGEGLSP 
TES CEGLGDPACFYVAVI FI LNGLMMALFFI YGT YLSGS RLGGL 
VTVLCFFFNHGECTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
TKLYRGSLIALC ISNVFFMLP WQFAQFVLLTQ I ASLFAVYWGY 
IDICKLRKIIYIHMISLALCFVLMFGNSMLLTSYYASSLVIIWG 
I LAMKPHFL KI NVS ELSLWVI QGC FWLFGTVI L KYLTS K I FG I A 
NDAH IGNLLTS KFFS YKDFDTLLYTCAAEFDFME KETPLR YTICT 
LLLPWLVGFVAIVRKI ISDMWGVLAKQQTHVRKHQFDHGELVY 
HALQLLAYTALG ILIMRLKLFLTPHMCVMASLI CSRQLFGWLFC 
KVHPGAIVFAILAAMSIQGSANLQTQWNIVGEFSNLPQEELIEW 
IKYSTKPDAVFAGAMPTMASVKLSALRPIVNHPHYEDAGLRART 
KIVYSMYSRKAAEEVKRELIKLKVNYYILEESWCVRRSKPGCSM 
PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEW 
KB 


6278 


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVLIVEGQQYAWGTVLLL 
IRI ILEYCQGVDNIPSVTTDMLTRLSDLLKYFNSRS CQLVLGAG 
ALQWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKQ 
YSMLRHFDHITKDYHDHIAEISAKLVAIMDSLFDKLLSKYEVKA 
P VPS AC FRNI CKQMTKMHEAI FDLLPE E QTQMLFLR INAS YKLH 

LKKQLSHLNTVINDGGPQNGLVTADVAFYTGNLQALKGLKDLDLN 
MAEIWEQKR 


6279 


127 


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTL
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELLAMA 
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT 
VTLGGTSDPSTLSS SALS EREASRLDKFKQL LAG PNTDLEELRR 
LSWSGIPKPVRPMTWKLLSGYLPANVDRRPATLQRKQKEYFAFI 
EHYYDSRNDEVHQDTYRQ IH I D I PRMS PEALI LQPKVTE I FER I 
LFI WAIRHPASG YVQG INDLVTPFFWFI CE YIEAEE VDTVDVS 
GVPAEVLCNI EADTYWCMSKLLDG I QDN YT FAQPG I QMKVKMLE 
ELVSRIDEQVHRHLDQHEVRYLQFAFRWMNNLLMREVPLRCTIR 
LWDTYQS E PDGFS HFHL YVCAAFL VRWRKE I LEEKD FQELLLFL 
QNLPTAHWDDEDISLLLAEAYRLKFAFADAPNHYKK 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE
DVDLAQVLAYLLRRGQVRLVQGGGAANLQF I QALLDS E EENDRA 
WDGRLGDRYNPPVDATPDTRELEFNE I KTQVELATGQLGLRRAA 
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YS Q KAFCG I YS KDGQ I FMS ACQDQT I RL YDCR YGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
I E SHE DDVNAVAFAD I S S Q I LFS GGDDAIC KVWDRRTMREDDPK 
PVGALAGHQDGITFI DSKGDARYLI SNSKDQTI KLWD IRRFSSR 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKL PGDSS LMT YRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK 
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNE I KTQVELATGQLGLRRAA 
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YS QKAFCG 1 YS KDGQ I FM S ACQDQT IRL YDCR YGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDG I TFI DSKGDARYLISNSKDQTI KLWDIRRFS S R 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTL I RCR FS P I HS TGQQF I YS G CSTG KVWYD LLSGH I VKK 
LTNHKACVRDVSWHP FEEKIVS SS WDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6282 


125 


906 


RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVI IR 
FVTNTTKESKQDLLERLRKLEFDISEDEIFTSLTAARSLIiERKQ 
VRPMLLVDDRALPDFKG I QTSD PNAWMGliAPEHFHYQ I LNQAF 
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
G I L VKTG KYRAS DEEK I NP PP YLTCES FPHAVDH I LQHLL 


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP
KKQLPS I PKNALPITKPTS PAPAAQSTNGTHAS YGPFYLE YSLL 
AE FTLWKQ KL PG VYVQ P S YRSALMW FGV I F I RHGLYQDG VFKF 
TVY I PDNYPDGDCPRLVFD I PVFHPLVDPTS GELDVKRAFAKWR 
RNHNH I WQ VLM YARRVF YK I DTAS PLN PEAAVL YEKD IQL F KSK 
WDS VKVCTARLFDQPKI EDPYAI S FS PWNPS VHDEAREKMLTQ 
KKK P E EQHNKS VHVAGLS W VKPG S VQ P FS KE E KT VAT 


6284


1 


2879 


RSVIPGSTISSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQVERENVQKRTFTRWINLHLEKCNPPLEVKDLFVDIQDGKI 
LMALLEVLSGRNZjLHEYKSSSHRIFRLNNIAKALKFLEDSNVKL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
LAPGSGGTDSDSS FPPTPTAERSVAI S VKDQRKAI KALLAWVQR 
KTRKYGVAVQDFAGSWRSGLAFLAVIKAIDPSLVDMKQALENST
REMLE KAFS I AQDALH I PRLLEPED I MVDTPDEQS I MT YVAQFL 
ERFPELEAEDIFDSDKEVPIESTFVRIKETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRLDGVSSHALSDS 
STE FMHQI I DQ VLQ GG PGKTS DISBPSPESSILS SRKENGRS NS 
LPIKKTVHFEADTYKDPFCSKNLSLCFEGSPRVAKESLRQDGHV 
LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS 
CNGALESTARHDE ESHSLS PPGENTVMADSFQ I KVNLMTVEALE 
EGDYFEAIPLKASKFNSDLIDFASTSQAFNKVPSPHETKPDEDA 
EAFENHAEKLGKRS IKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQS S SS SS VPGESLPS ASDQVLYLSRGGVGTTPASEPAP 
IAPHEDHQQRETKENDPMDSHQSQESPNLENIANPLEENVTKES 
ISSKKKEKRKHVDHVESSLFVAPGSVQSSDDLEEDSSDYS I PSR 
TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAAFIFSYITAVTLHH 
I DPALP YI SDTGTVAPEKCLFGAMLNIAAVLCIATI YVRYKQVH 
ALS PEENV 1 1 KLNKAGLVLG I LS CLGLS I VANFQ KTTL FAAHVS 
GAVLT FGMGS L YM FVQTI LS YQMQPKI KGKQ VFW I RLLLVI WCG 
VSALSMLTCSSVLHSGNFGTDLEQKLHWNPEDKGYVLHMITTAA 
EWSMS FS FFGFFLTYI RDFQK I SLRVEANLHGLTLYDTAPCP IN 
NERTRLLSRDI 


6286 


1619 


276 


KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
P FS I PAAS E I ADLSN I INKLLKDKNE FHKHVE FD FL I KGQFLRM 
PLDKHMEMENISSBEWEIEYVEKYTAPQPEQCMFHDDWISSIK 
GAE EW I LTGS YDKTS R I WSLEGKS I MT I VGHTD WKDVAWVKKD 
SLSCLLLSASMDQTI LLWEWNVERNKVKALHCCRGHAGSVDS IA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEBICSASWDHTIRVWD 
VESG S LKS TLTGNKVFNC I SYS PLCKRLAS GSTDRH I RLWDPRT 
KDG S LVS LS LTS HTGWVTS VKWS PTHEQQL I S GSLDNI VKLWDT 
RS CKAPL YDLAAHEDKVLS VD WTDTGLLLS GGADNKL YS YR YS P 
TTSHVGA


6287 


278 


1482 


MQFFFNFQIGLRSTSGKEKYSGDAGFI^DALQLFLQCLALDEDF 
APAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS 
VMEESQSLNEPS PKQSEEI PEVTSEPVKGSLNRAQSAQS INSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLBQDVIVNEDGR 
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
PVTTPCGHS FCKNCLERCLDHAP YCPLCKES LKE YLADRRYCVT 
QLLEELIVKYLPDELSERKKIYDEETAELSHLTKNVPIFVCTMA 
YPWPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 
GCMLQIRNVHFLPDGRSVVDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV 


6288 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSIIiNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPVVESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR


6289 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPS I liNSDLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPVVESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM 
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRMISRYTRKA 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
D I TRESS FTS ADTGNS L S AFPS YTGAGI STEGS SD FS WG YGELD 
QNATEKVQTMFTAIDELLYEQKLS VHTKSLQEE CQQWTAS FPHL 
RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFSSSYAHKASS IAKSSSFCSMERDEEDS I IVSEGI IEEY 
LAFDH I D I E EGFHGKKS EAATEKQKLG YP P I AP FYCM KEDVLAY 
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDSESSCV 
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG 
MPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLSYTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEILRGA 
RVP VAPDS LS S P S P TPLS RNNLL P P I GTAEVEHVS TVG PQRQMK 
PHGDSSRAQSAWDEPNYQQPQERLLLPDFFPRPNTTQSFLLDT 
Q YRRS CAVE YPHQARPGRG S AGPQLHGS TKS Q S GGRP VS RTRQG 
P 


6291 


1732 


602 


LVAKMASSASARTPAGKRVINQEELRRLMKEKQRLSTSRKRIES 
P FAKYNRLGQLS CALCNTP VKSELLWQTHVLGKQHREKVAELKG 
AKEASQGS SASSAPQS VKRKAPDADDQDVKRAKATLVPQVQPS T 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDQMDKEWDE FQKAMRQ VNTI SEAI VAEEDEEGRLDRQ 
IGEIDEQIE CYRR VEKLRNRQDE I KNKLKE ILTIKE LQKKBEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6292 


1835


1142 


TCPGAMKM VAPWTR F YSNS C C LCCHVRTGT I LLG VW YH INAW 
LLILLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI 
LI CAMATYGAYKQRAAWI I PFFCYQI FDFALNMLVAITVLI YPN 
S I QEY IRQLP PNFP YRDDVMS VNPTCLVLI I LLF I S 1 1 LTFKGY 
LIS CVWNC YR YINGRNS SDVLVYVTSNDTTVLLP P YDDATVNGA 
AKEPPPPYVSA 


6293 


2382


1035 


F W CTLGT VD VH P IG WCA I NS K IL VP PRT I HAKFTDWKG YLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLEWDKSQVSRTRMAW 
DTVIGGRLRLL YEDGD SDDDFW CHMWS PLIHP VGWSRRVGHGI K 
MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK
LE AI D P LNLGN I CV AT VCKVLLDG YLM I CVDGG P STDGLDWFC Y 
HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 
AP SRL FNMDCPNHGFKVGM KLEAVDLME PRLI CVATVKRWHR L 
LSI HFDGWD S E YDQ W VDCE S PD I YP VGWCELTG YQLQ P P VAAE P 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 
PQGARKIS S EPVPGE 1 1 AVRVKEEHLDVASPDKAS SPELPVSVE 
NIKQETDD


6294


354 


1814 


AQLTTRGRTVAGGVRWIPSPFPDLELYSCCLGTDRGFPELSHHC
KNV I ATAS D YDMAE I TNI RP S FDVS P WAGL I GAS VL WCVS VT 
VF VWSCCHQQAEKKHKNPP YKF I HMLKG I S I YPETLSNKKK 1 1 K 
VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ 
LP I KMDYG EELRSP I TS LTPGES KTTS PS S P3EDVMLGS LTFS V 
DYNFPKKALWTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR 
VKTRVLRKTLD P V FDE T FTFYG IP YSQLQD LVLH FLVLS FDRFS 
RDDVIGEVMVPLAGVDPSTGKVQLTRDIIKRNIQKCISRGELQV 
SLSYQPVAQRMTVWLKARHLQKMDIAGLSGNPYVKVNVYYGRK 
R I AKKKTHVKKCTLNP I FNES F I YD I PTDLLPDI S I E FL V I DFD 

RTTKNEWGRLILGAHSVTASGAEHWREVCESPRKPVAKWHSLS 
EY 


6295 


2795 


617 


VSS ALLTGATSGSDAAKS EGASAS PLSCTNAVAMDRPDEGP PAK 
TRRL S S SE S PQRDP P P P P p PPPLLRLPLP PPQQR PRLQEET EAA 
Q VLADMRGVGLG PAL P P P P P YV I LE EGG I RAY FTLGAE C PG WDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 
GALETCSAVGWAPQRLVDPKSKEEAI I IVBDEDEDERESMRSSR 
RRRRRRRR KQR KVKRE SRE RNAERME S I LQALED IQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDLIIQHIPGFWVKAFLNHPR 
I S I L INRRDED I FR YLTNLQVQDLRH I SMG YKMKL YFQTN P YFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHSLPEADRIAEIIKNDLWVNPLRYYLRERGSRIKRKK 
QEMKKRKTRGR CE WIME DAPDY YAVED I FS E I S DI DET IHD I K 
ISDFMETTDYFETTDNEITDINENICDSENPDHNEVPNNETTDN 
NESADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNENTY 
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEG S DDDDRD I E YYE KV I EDFDKDQAD YED VI E 1 1 

SDESVEEEGIEEGIQQDEDIYEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNGWANPGKRGKTG 


6296 


727 


1199 


RHCGCDAQGACDSLPPTGTSS PVTARNAI PEARCCVWLLDGTTV 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSLQFPSPFSGTISFGSFSDSGIFPLGSQCCLGFQQFSISGK 
KWAL IHKRVRLS VFGARWGR I YFGK 


6297 


1 


922 


QRAAAASPSSCGPRGAEYGALMAMEGYWRFLALLGSALLVGFLS 
VI FALVW VLHYRE GLG WDGS ALE FNWH P VLM VTG F VF I QGIAI I 
V YRLP WTW KCS KL LMKS IHAG LNAVAAI LAI I S WAVFENHNVN 
NIANMYSLHSWVGLTAVTrYT.T.nr.T^npm/cr t,dw»di ctdtiht 

MPIHVYSGIVIFGTVIATALMGLTEKLIFSLRDPAYSTFPPEGV 
F VNTLGLL I LVFGAL I FW I VTRPQ WKRP KE PNS T I LHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNNEVAARKRNLALDEAGQRSTM 


6298 


3 


985 


SVPLRRLSLSGTLQGAGTTTKMAVARIiAAVAAWVPCRSWGWAAV 
P FGPHRGLS VLLAR I PQRAPR WLPACRQ2CTS LSFLWR PDL PNLA 
YKKLKGKS PG 1 1 F I PG YLS YMNGTKALAI EEFCKSLGHAC I RFD 
YSGVGSSDGNSEESTLGKWRKDVLS 1 1 DDLADGPQ I LVGS S LGG 
WLMLHAAI ARPE KWAL IGVATAADTLVTKFNQLPVELKKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLIiHSPIPVNCPIRL 
LHGMKDDIVPWHTSMQVADRVLSTDVDVILRKHSDHRMREKADI 
QLLVYTIDDLIDKLSTIVN 


6299 


512 


814 


BCDLEG IM PNVT I SLS L PTNGS PLOD T T ,VH vtq t .h q a t t .t q a 

SIDAMDDSAFSGPYKFPFTPPLESFNLCFYTSQVPVPPILGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAP S CWSQRG VP AAGT P S S PRLL VS RAAAP S AG P WGAWRQGARA 
AQS P FS I PNSSS VPYGSQDS VHSSPEDGGGGRDRPVGGS PGGPR 
LVIGSLPAHLSPHMFGGFKCPVCSKFVSSDEMDLHLVMCLTKPR 
ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYLRCYRCLLETKEliGCLLGSDICIiTP 
AGS S C I TLHKKNS SGSDVMVSDCRS KEQMSDCSNTRTSPVS GFW 
IFSQYCFLDFCNDPQNRGLYTP 


6302 


4$0 


745 


I FGFLHLFHM EHS FLL VCAL FAHVF FS S S CG S S VALHSD P CLLS 
P VLLN CLPGD LRPLDBL YAQ KL KYKAI S EE LDHALNDMTS L 


6303 


2 


1961 


YWNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL

YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM 

KVDLVS FLSS P IMGDNDSSGTSDKDHS EILDG I SNI KLNSEE VT 

QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 

SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 

PSKLKRSHELDIDENPASDFDDSGSLLGFKYGSGQKYGGIPNFS 

HRQVRYLEKNVKLKSKYLDMRRQIKMKNKHIFFTKESEKPFFKK 

SKI LS KVEKFLTWVNKPMDEEASQESS SHDNGHDASTSCDSEEQ 

DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYERDSLLATV 

PDEQDCOTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 

VPELAKYWAQRYRLFSRFDDGIKLDREGWFSVTPEKIAEHIAGR 

VSQSFKCDWVDAFCGVGGNTI QFALTGMRVI AID I D P VKI ALA 

RNNAEVYGI ADK I E FICGDFLLLAS FLKAD WFLS PP WGGPD YA 

J./U1 iv iJXKlIYlMiPlXatEIFRI^SiQCITNNIVYFLPRNADIDQVAS 

LAGPGGQVE I EQNFLNNKLKT I TAYFGDLI RRPAS ET 


6304 


1 


1438 


HRARVDRSRESPGGDLRHPGRVRRDITLSGHPRLSTQHWLLRE 
i^Cj vvjurvj i. t\jjjjyjn.r'ijfi<j^i?'HJa luorS V V ILiVai^ljPGSDMAftTjPA 
WRATSGLTLWPHTAEGRDLLGAENRALTGGQQAEDPTLASGAYQ 
WPGSVEKLQGSVWCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 
AQGEWDKARVPAHGQVLQVG FSTEAALQDLSS PRLS QLCSQGL 
CGLIKRPGDLPEVLSFHVDRVLGLiRRSLPAVARRFHSPLLPYRY 
TDGGARP V I WWAP D VQHLS D P DEDQNS LALG WLQ YQ ALLAHS CN 
WPGQAPCPGIHHTEWARLAIiFDFLLQVHDRLDRYCCGFEPEPSD 
P C VEERLRE KCRN P AE LRL VH I LVRS S DP SHL VY I DNAGNLQH P 
EDKLNFRLLEG I DG F P ES AVKVLASGCLQNMLiLKSLQMD P VFWE 
SQGGAQGLKQVLQTLEQRGQVLI^HIQKHNLTLFRDEDP 


6305 


99 


420 


NMIWRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPQQEEPPTES 
RD P APGQE RE EDQGAAE TQ VPDLEADLQE LS Q S KTGDECGDG P D 
VQGKILTKSEQFKMPEGR 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEAL INGDVPME EATDFSEADKS ALMD 
ESEDSGVI PGSHSENALHASEEEEGEGGKAQSSLGYI PLMRWQ 
SVRHTTRKS STTLREGWWHYSNKDTLRKRHYWRLDCKC I TLFQ 
NNTTNRYYKE I PLSEI LTVESAQNFSLVPPGTNPHCFE I VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQAS LS I S VSNSQI QENVDI ATVYQ I FPDE VLGSGQF 
GWYGGKHRKTGRDVAVKVIDKLRFPTKQESQLRNEVAILQSLR 
HPGIVNLECMFETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 
TKFLITQILVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFAR I IGEKSFRRS WGTPAYLAPEVLLNQGYNRSLDMWS VG 
VIMYVSLSGTFPFNEDEDINDQIQNAAFMYPASPWSHISAGAID 
LINNLLQVKMRKRYSVDKSLSHPWLQEYQTWLDLRELEGKMGER 
YITHESDDARWEQFAAEHPLPGSGLPTDRDLGGACPPQDHDMQG 
LAERISVL


6307 


2136 


589 


C F L LPRGRD P E P PEAGAAAPCAPGAPDMS FR K WRQS KFRH VPG 
Q P VKNDQ C YED I RVS RVTWDS TFCAVN P KFLAV I VEASGGGAFL 
VLPLSKTGRIDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDC 
TVMVWQIPENGLTSPLTEPWVLEGHTKRVGI IAWHPTARNVLL 
SAG CDNWL I WNVGTAE E L YRLDSLH PDL I YNVS WNHNGS L FCS 
ACKDKSVRIIDPRRGTLVAEREKAHEGARPMRAIFLADGKVFTT 
GFSRMSERQLALWDPENLEEPMALQELDSSNGALLPFYDPDTSV 
VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQRGMGSMPKR 

PEAALEAEEWVSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
DS RPAMAPGS SHLGAPASTTTAADATPSGS LARAGEAG KLEEVM 
OE LRALRAL VKFOGnR T CRT iFFOT ft R MPKFfn a 


6308 


2 


1118 


GRPTRPEKMLLSLVLHTYSMRYLLPSWLLGTAPTYVLAWGVWR

LLSAFL PAR F YQALDDR L YC VYQS MVLFFFEN YTG VQ I L L YGDL 

PKNKENIIYLANHQSTVDWIVADILAIRQNALGHVRYVLKEGLK 

WLPLYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL 

VIFPEGTRYNPEQTfCVLSASQAFAAQRGLAVLiCHVLTPRIKATH 

VAFDCMKNYLOAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 

KIH I H I DRI DKKD VP EEQE HMRRWLHER FE I KD KM L I E F YES P D 

YVNTWIYGTLLGCLWVTIKA 


6309 


220 


563 


LVAEVKE PCSLPMLS VDMENKENGS VGVKNSMENGRPPD PADWA 
VMD WNY F RTVG FE E QAS AFQEQE I DGKS LLLMTRND VLTGLQ L 
KLGPALKIYEYHVKPLQTKHLKNNSS 


6310 


36 


973 


GPRCWKFLILSSVNCETLRIGKAWPQSSGQERYWTPRTHSSASE 

DLHLEENS FQQGMDRVQCSGDLQLAHQLQQEEDRKRRS EESRQE 
I EEFQKLQRQYGLDNSGGYKQQQLRNME I EVNRGRMP PSE FHRR 
KADMMES IiALG FDDGKTKTSGI I EALHR YYQNAATDVRRVWLSS 
WDHFHSSLGDKGWGCGYRNFQMLLSSLLQNDAYNDCLKGMLIP 
CI PKIQS M I EDAWKEGFDPQGAS QL 1 1 RLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


PVWWNS C EG PRLAAAARTGHG VG RRARLACLGE PR VKAAVMLTL 
AS KLKRDDGLKGS RTAATAS DSTRRVS VRDKLLVKE VAELEANL 

* ^- a ^tvvnr cutris xvLirl^, c vi-*- 1 1*1 ruoul iyv^tjlvf y r IS i & V PDAi S* 

MVPPKVKCLTKIWHPNITETGEICLSLLREHSIDGTGWAPTRTL 
KDVVWGLNSLFTDLJiNFDDPI^IEAAEHHLRDKEDFRNlCVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKRFJVGMKMLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTEEEAKQIiAEEMNIAFYTSRTDDILLHQDVDLVCISIPPPLT 
RQ I S VKALG I G KNWCEKAATS VDAFRMVTAS R YY PQLMS L VGN 
VLRFLPAFVRMKQLISEHYVGAVMICDARIYSGSLLSPSYGWIC 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
I RG I RHVTS DD FC F FQMLMGGG VCS T VTLNFNM PGAFVHE VMW 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGLPEQGPQD 
VPLLYLKGMVYMVQALRQS FQGQGDRRTWDRTPVSMAAS FEDGL 
YMQSVVDAIKRSSRSGEWFAVEVLTEEPDTNQI^CEAIjQRNNL 


6313 


2 


2071 


QRSGAARLAFLPSPFS PACVHRS PLS FHGCWF YFWVFMPLG VL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEKHKMILD
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF
TQE P LVE I EG VS KMAFRHL I EFTYTAKLM I QGEE E AND VWKAAE 
FLQMLE A I KALE VRNKENS APLEENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVEDEGIBTLEEVASAKQSVK 
YIQSTGSSDDSALALLAJDITSKYRQGDRKGQIKEDGCPSDPTSK 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE
SFKCEICNKRYIjRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TE PVTSMTI IEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS 
QVHDSHMSE L P EQ VQ VS YLE VGR I QTE EGTE VHVEELKVE RVNQ 
M P VE VQTELLE ADLDHVT P E I MNQEER ES SQ ADAAE AARE DHED 
AEDLE TKPT VD S E AE KAENE DRTALP VLE 


6314 


2 


2071 


QRSGAARLAFLPS PFSPACVHRSPLS FHGCWF YF WVFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TQBPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE 
FLQMLEAI KALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEIEVBIAEGTIEVEDEGIETLEEVASAKQSVK 
Y IQS TGS S DDS ALALLAD ITS KYRQGDRKGQ IKEDGCPSDPTSK 
QVEGIE I VELQLSHVKDLFHCEKCNRS FKLFYHFKEHMKSHSTE 
SFKCE I CN KRY LRES AWKQHLNC YHLEEGGVS KKQRTGKK IHVC 
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYE CQ VCNS VFNS WDQFKDHL VI HTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHN I S ERLVTEE VLSVETRVQ 
TEPVTSMTI I EQVGKVHVLPLLQVQVDSAQ VTVEQVHPDLLQDS 
QVHDSHMSELPEQVQVS YLE VGR I QTEEGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLE TKPTVDSEAEKAENEDRTALP VLE 


6315 


1 


1015 


LGLAVNWTTLVLISYCPTATEEAPYWTYLLCALGLFIYQSLDA 
I DGKQARRTNS CS PLGELFDHGCDSLS TVFMAVGAS I AARLGT Y 
PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIALVI 
VFVLSAFGGATMWD YTI PILE I KLKI LPVLGFLGGVI FSCSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGLI I ILAI MI YKKSATD 
VFEKHPCLYILMFGCVFAKVSQKLWAHMTKSELYLQDTVFLGP 
GLLFLDQ YFNNFI DE YWLWMAMVI S S FDMVI YFSALCLQ ISRH 
LHLNI FKTACHQAP EQVQVLS S KSHQNNMD 


6316 


1503 


792 


VSAGAGTG IMGGTTS TRRVTFEADENEN I TWKG I RLS ENVI DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRAILRERICSEEERAKAKHL 
ARQLEEKDRVLKKQDAFYKEQLARLEERSSEFYRVTTEQYQKAA 
EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLE KGG 


6317 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQDARYGQKDSSDQN 
FDYMFKLLI IGNSS VGKTS FLFRYADDS FTS AFVSTVG I DFKVK 
TVFKNEKR I KLQ I WDTAGQ ER YRTITTAYYRGAMGF I LMYD I TN 
E ES FNAVQD WSTQ I KT YSWDNAQ VILVGNKCDMEDERV I STERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAITAAKQNTRLKETPPPPQPNCAC 


6318


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLELRREAPPL 
LG PLLS P F PLPAGS WHRQMLRSS LRFP I TNS AGAP CKAAGRMNI 
LAP VRRDRVLAELP Q CLRKEAALHGHKD FH PRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLG I PFSLQLWDTAGQERFKCIASTYYRGAQAI 1 1 VFNLN 
DVASLEHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM 
E KDALQVAQEMKAE YWAVS SLTG ENVRE FF FR VAALTFEANVLA 
E L E KSGARR I GDWR I NSDDSNL YLTAS KKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLGKKWLVPYTSEHVPSRYHEWMKSEELQRLT 
ASEPLTLEQEYAMQCSWQEDADKCTFIVLDAEKWQAQPGATEES 
CMVGDVNLFLTDLEDLTIiGE I EVM IAEPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 
TLRLTVSESEHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAMAAVDSFYLLYREIARSCNCYMEALALVGAWYTA 
R KS I TV I CDFYSL I RLHF I PRLGSRADL I ICQ YGR WA WSGATDG 
I G KAYAE E LAS RGLN I 1 L I SRNEE KLQWAKD I ADT YKVE TD 1 1 
VADFSSGRE I YLP IREALKDKDVGILVNNVGVFYPYPQYFTQLS 
EDKLWDIINVNIAAASLMVHVVLPGMVERKKGAIVTISSGSCCK 
PTPQLAAFSASKAYLDHFSRALQYEYASKGIFVQSLIPFYVATS 
MTAPSNFLHRCSWLVPSPKVYAHHAVSTLGISKRTTGYWSHSIQ 
FLFAQ YMPEWLWVWGAN I LNRSLRKEALS CTA 


6321 


1418


341 


HRKAAI^AIJ4AGRLLGKALAAVSLSLiAIASVTIRSSRCRGIQAF 
RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNFSPK 
FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP 
NHAADPIITRWKRDSSGNKIMHPVSGKHILQFVAIKRKDCGEWA 
IPGGMVDPGEKISATLKREFGEBALNSLQKTSAEKREIEEKLHK 
LFS QDHLV I YKGYVDDPRNTDNAWMETEAVNYHDETGEIMDNLM 
LEAGDDAGKVKWVDINDKLKLYASHSQFIKLVAEKRDAHWSEDS 
EADCHAL 


6322


2047 


1083 


NQEILKNVESSRTVQPHFLEFLLSLGWSVDVGRHPGWTGHVSTS 
WS INCCDDGEGSQQEEVI SSEDIGAS I FNGQKXVLYYADALTEI 
AF WP S P VE S LTDS LESN I S DQD SDSNMDLMP G IIiKQPSLTLEL 
FPNHTDNLNS S QRLS PS S RMRKL PQGR P VP PLGPE TRVS WWVE 
RYDD I ENFPLSELMTE I STGVETTANS STSLRSTTLEKE VPVI F 
I H P LNTGLFR I KI QGATG KFNM VI P LVDGM I VSRRALG FLVRQT 
VINICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PAS TTDGAQ EARVPLDG AFW I PRP PAG S PKGC FAC VS KP PALQA 
PAAPAPEPSASPPMAPTLFPMESKSSKTDSVRAAGAPPACKHLA 
E KKTMTNP TTV I E VYP DTTE VND Y YLW S I FNF VYLNFC CLGF I A * 
LAYS LKVRDKKLLNDLNGAVEDAKTDRLINITRSGLAAS C IMLW 
MALS VIATHRGLRSSAS I LVAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRS DLQFQPEEAS P WTQPGVHGP 
WTE LETHG S QTQ PER VKS WADNLWTHQNS S S LQTH P EGACP S KE 
PSADGSWKELYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPE PGEL LTHLYSHLKCS PLCPVPRLI I TPETPE PEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYS PFWS FRKHYPWVQLSGHAGNFQAGEDGR I LKRFCQC 
EQRSLEQLMKDPLRPFVPAYYGMVLQDQQTFNQMEDLLADFEGP 
9IMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 
E EHAQG AVTKP R YMQWRE TM S S TSTLG FR I EG I KKADGTCNTN F 
KKTQALEQ VT KVLEDFVDGDHVI LQK YVACLEELRE ALE I S PF F 
KTHBWGSSLLFVHDHTGLAKVWMIDFGKTVALPDHQTLSHRLP 
WAEGNREDGYLWGLDNMI CLLQGLAQS 


6325


165 


944 


GLRDPFRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRLSEKDRMELLEIAKTNAAKALGTTNIDLPASLRT 
VPSAKETSRG IGVSSNGAKPEVS ILGLSEQNFQKANCQ I 


6326 


238


680 


GE PS P ATQQKP S ATGAGVLHQHFSSGH I YVLMGLL P P P WT I S FT 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDLARRMVPRAQKRTQTL 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 
QAWGG VGQEAS SG VP 


6327 


1 


1337 


S LARLAPAGGS VVMP TQQPAAP S TRAPKP SRSLSGS LCAL FS DA 
DSGSGM KAE LPPGPGAVGREM TKE EKLQLRKEKKQQ KKKRKEE K 
GAE PE TGS AVS AAQCQG PTRE L PE SG I QLGTPRE KVPAGRS KAE 
LRAERRAKQE AERAL KQARKGEQGGP P P KAS PSTAGETPSG VKR 
LPEYPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RQNS LTQFMS I PS S VI HPAMVRLGLQ YS QGL VRGSNARCIALLR 
ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFLTQCRPLSASMHN 
AI KFLNKE I TS VGS S KREEEAKSELRAAI DRYVQE KI VLAAQAI 
S RFAYQK I SNGDVI LVYGCS S LVS RI LQ EAWTEGRR FRVWVDS 
RPWLEGRHTLRSLVHAGVPASYLLIPAASYVLPEVSTEEKDSKV 
GGEKV 


6328 


1030 


276 


HASAEVTTAAARGLGAMEEEMHTDAKIRAENGTGSSPRGPGCSL 
RHFACEQNLLSRPDGSASFLQGDTSVLAGVYGPAEVKVSKE I FN 
KATLEVILRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI 
TWLQWSDAGSLLACCLNAACMALVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329 


3 


2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDIERKSLAINEEFVSIFKEVKEELESISEDVQAMSNCCQDMT 
SRLQAAKEQTQDLI VKTTKLQSESQKLB IRAQVADAFLS KFQLT 
SDEMSLLRGTREGPITEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GLEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA 
MEALQDRPVLYKYTLDEFGTARRSTWRGFIDALTRGGPGGTPR 
PIEMHSHDPLRYVGDMLAWLHQATASEKEHLEALLKHVTTQGVE 
ENI QE WGH I TEGV CR PLKVR I EQ VI VAE PGAVLL YK I S NLLKF 
YHHTISGIVGNSATALLTTIEEMHIiLSKKIFFNSLSIiHASKLMD 
KVELPPPDLGPSSALNQTLMLLREVLASHDSSWPLDARQADFV 
QVLS CVLD P LLQMCTVS ASNLGTADMAT FMVNS LYMMKTTLAL F 
E FTDRRLE MLQ FQ I EAHLDTL I NE QAS YVLTRVGLS Y I YNTVQQ 
HKPEQGSLANMPNLDS VTLKAAMVQFDRYLSAPDNIiLI PQLNFL 
LSATVKEQIVKQSTELVCRAYGEVYAAVMNPINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151 


333 


FFYYTFYENKTFSRKMVAEKETLSLNKCPDKMPKRTKLLAQQPL 
PVHQPHSIiVSEGFTVKAMMKNSWRGPPAAGAFKERPTKPTAFR 
KFYERGDFP I ALEHDS KGNKIAWKVE IE KIiD YHH YL PL FFDGLC 
EMTFP YEFFARQGIHDMLEHGGNKILPVLPQLI I PIKNALNLRN 
RQVI CVTLKVLQHLVVS AEMVGKALVP YYRQ ILPVLNI FKNMNV 
NSGDG I DYSQQKRENI GDL IQETLEAFERYGGENAF INI KYWP 
TYESCLLN 


6331 


3 


495 


QQGQRVRTRGRRACAS ATP L EX3C VDLS YPRTHAAIiLKVAQMVTL 
L I AF I CVRSSLWTNYSAYS YFEWTI CDL IM ILAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGLVAGA 
I FGFMATFLCMASIWLSYKISCVTQSTDAAV 


6332 


1 


678 


. VTE SNKFDLVS FI PLLRERI YSNNQYARQF I IS WILVLES VPDI 
NL LD YLPE I LDGLFQ I LGDNGKE I RKMCEWLGE FLKE I KKNPS 
SVKFAEMANIIjVIHCQTTDDLIQLTAMCWMREFIQLiAGRVMLPY 
SSGILTAVLPCLAYDDRKKSIKEVANVCNQSLMKLVTPEDDELD 
ELRPGQRQAEPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD 
V IG VALGPHLS NQD Y FM YVTHTI VAATQRSGS SGS P PFCRQDTG 
KLSTMATHSQLVKTGTGLE PRQAVS S SH 


6333 


3 


1467 


TRT P S EAEAGGE S PQS CVSAAHS D WTAG KP VS LLAP LIP PRS AG 
QPLTFSPSGRQPLRSLLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 
vj^vxvarjyv x i.nivaruoK i I x liN 1 IS 1 J\yo 1 WKrUriJlJijl\l irAJiUJLiJjSK 

CPWKEYKSDSGKPYYYNSQTKESRWAKPKELEDLEGYQNTIVAG 
SL I TKSNLHAM I KAEESS KQEECTTTSTAPVPTTE I PTTMSTMA 
AAEAAAAWAAAAAAAAAAAAANANASTSASNTVSGTVPVVPEP 

EVTQT\/2X'T T WnMP , N r r\7 r PT CTPPOfifYr/T'CTDTV Torw^oircTrc? okptvi 
ci v j. aivrti v v ua EjLN iviioi HCyAyii JL o 1 f J\L y U\^t> V ci V b orJ i\j 

EETSKQETVADFTPKKEEEESQPAKKTYTVTNTKEEAKQAFKELL 
KEfCRVPSNAS WEQAMKMI INDPRYSALAKLSE KKQAFNAYKVQT 
EKK 


6334 


17


644 


GGNPSGRAAGFAAAAMPSSPLRVAWCSSNQNRSMEAHNILSKR 
\3CO vx\orv3 1VJ iflvlNjjlroiriir'iJAh'W V x L/r J\l 1 IIJUIYIx^UIjIjKKijK 

ELYTQNGILHMLDRNKRIKPRPERFQNCKDLFDLILTCEERVYD 
QWEDLNSREQETCQPVHWNVDIQDNHEBATLGAFLICELCQC 
I QHTE DMENE I DELLQE F EE KSGRTFLHT VC FY 


6335 


82 


529 


Ek ft O TV *D D/TlX 7T . T T .O >\ 7\ t /*1 nr\CDT rri mf £T w t ^ m rrmv t tt r/\n^ »i-i^t« 1 

HAKAKrUVIjv^KJjJjvjAALGiJUb 

KLRQGENIjILGFS IGGG IDQDPSQNPFSEDKTDKGI YVTRVSEG 
GPAEIAGLQIGDKIMQVNGWDMTMVTHDQARKRLTKRSEEVVRL 
LVTRQSLQKAVQQSMLS 


6336 


1003 


438 


HEPASKGRAEVGNMRLSVAAAISHGRVFRRMGLGPESRIHLLRN
LLTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM
ADFWLTEKDLIPKLFQVLAPRYKDQTGGYTRMLQIPNRSLDRAK
MAVIEYKGNCLPPLPLPRRDSHLTLLNQLLQGLRQDLRQSQEAS
NHSSHTAQTPGI


6337 


76 


524 


EGIQMLSVQPDTKPKGCAGCNRKIKDRYLLKALDKYWHEDCLKt 
ACCDCRIjGEVGSTLYTKANLILCRRDYLRLFGVTGNCAACSKLI 
P AF EM VM RAKDNVYHLDC FACQLCNQRF CVGDKF FLKNNM I LCQ 
TDYEEGLMKEGYAPQVR 


6338 


66 


1349 


APNS E SGTQG PL PT PANL FWTRRAN PD PTTSM S ATDRMG P KAVP 
GLRLALLLLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVNAKNY 
KNVFKKYEVLALLYHEPPEDDKASQRQFEMEBLILELAAQVLED 
KGVGFGLVDS E KDAAVAKKLGLTEVDS M YVFKGD E VI E YDGE FS 
ADTI VE FLLDVLEDPVELI EGERELQAFENIEDE I KLIGYFKS K 
DSEHYKAFEDAAEEFHPYIPFFATFDSKGAKKLTLKLNEIDFYE 
AFMBE P VTI PDKPNS E EE I VNFVEEHRRSTLRKLKP ES M YETWE 
DDMDG IH I VAFAE E ADPDG FE FLETL KAVAQDNTENP DLS I IWI 
DPDDFPLLWYWEKTFDIDLSAPQIGVVNVTDADRLWMEMDDEE 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSQGAMKAFH 
TFCWLLVFGSVSEAXFDDFEDEEDIVEYDDNDFAEFEDVMEDS 
VTESPQRVIITEDDEDETTVELEGQDENQEGDFEDADTQEGDTE 
S EP YDDEE FEGYEDKPDTS S S KNKDP I T I VD VPAHLQNS WES YY 

VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCEGMIjIQLRFL 
KRQDLLNVLARMMRPVSDQVQ I KVTMNDEDMDTYVFAVGTRKAL 
VRLQKEMQDLSEFCSDKPKSGAKYGLPDSLAILSEMGEVTDGMM 
DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPLKLPDTKR 
TLLLT FNVPGSGNT YP KDME ALL P LMNMV I YS I D KAKKFRLNRE 
GKQKADKNRARVBENFLKLTHVQRQEAAQSRREEKKRAEKER IM 
NEEDPEKQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 


6340


2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTSSTFRAERSFHSSSSS 
S3SSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
PARPGGAGN I KTLGDAYE FAVD VRDF S PED 1 1 VTTSNNHIEVRA 
EKLAADGTVMNNFAHKCQLPEDVDPTSVTSALREDGSLTIRARR 
HPHTEHVQQTFRTEIKI 


6341 


2 


£4 5 


KMAVLSAPGLRGFRILGLRSSVGPAVQARGVHQSVATDGPSSTQ 
PALPKARAVAPKPS SRGE YWAKLDDLVNWARRSS LWPMTFGLA 
CCAVEMMHMAAP R YDMDR FGW FRAS PRQ S D VM I VAGTLTNKMA 
PALRKVYDQMPEPRYWSMGSCANGGGYYHYSYSVVRGCDRIVP 
VDIYIPGCPP TAEALL YG I LQLQRK I KRERRLQ I W YRR 


6342 


2 


1191 


D PRVRAM LATLARVAALR KTCL FSGRGGGRGLWTGR PQS DMNN I 
KPLEGVKI LDLTRVLAG PFATMNLGDLGAEVI KVERPGAGDDTR 
TWGP PFVGTES TYYLS VNRNKKS I AVN I KDPKGVKI I KE LAAVC 
DVFVENYVPGKLSAMGLGYEDIDE IAPHI I YCS ITGYGQTGPIS 
ORAGYDAVASAVSGLMHTTnPEVAfT.ciHTaaMVT TnnvT?zvirDijr» 
TAHG S I VP YQ AF KTKDG Y I WG AGNNQQ PATVC KI LDLPE LI DN 
S K YKTNHLRVHNRKE L I KI LS ERFEE ELTS KWL YLFEGS G VP YG 
PINNMKNVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
M S EARPP PLLGQHTTH I LKE VLR YDDRA I GELL S AG WDQHETH 


6343 


2 


936 


GTAMVSDEDELNLLVIWDANPIWWGKQALKESQFTLSKCIDAV ' 
M VLGNSHL FMNRSNKLAV I AS H I Q ES RFL YPG KNGRLGD F FGDP 
GNPPEFNPSGS KDGKYELLTSANEVI VEE I KDLMTKSD I KGQHT 
ETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMNVI FAAQ KQN I L I DACVLDS DS GLLQQACD ITGGL YLKV 
PQMPSLLQYLLWVFLPDQDQRSQLILPPPVHVDYRAACFCHRNL 
IBIGYVCSVCZ.SIFCNFSPICTTCETAFKISLPPVLKAKKKKLK 
VSA 


6344 


2508 


147 


TMPTATI/3NLRG YGMAS PGLAAPS LTP PQLATPNLQQFF PQATR 
QSLLGPPPVGVPMNPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
P APE PE PCEAS E L P AKRLRS SEEPTEKEP PGQLQVKAQPQ ARMT 
VP KQTQTPDLL PE ALEAQVLPRFQ PRVLQVQAQVQ S QTQ PR I PS 

PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQLQKQVQTQT YPQ VHTQAQ PS VQ PQEH P PAQVS VQ P PEQTHEQ 
PHTQ P Q VS LLAPEQT PVWH VCX3LEMP PDAVE AGGGMEKTLP EP 
VGTQVSMEEIQNESACX3LDVGECENRAREMPGVWGAGGSLKVTI 
LQSSDSRAFSTVPLTPVPRP9DSVSSTPAATSTPSKQALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRLGEIQHMSQACLLSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGS E TYS PNTAYGVDFLVPVMG Y I CRI CHKF YH SNSGAQL 
SHCKSLGHFENLQKYKAAKNPSPTTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEMEEMIEQLQEKV 
HELEKQNDTLKNRL I SAKQQLQTQGYRQTPYNNVQSR INTGRRK 
ANENAGLQECPRKG I KFQDADVAETPHPMFTKYGNSLLEEARGE 
IRNLENVI QSQRGQ I EELEHLAE I LKTQLRRKENE I ELSLLQLR 
EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 
QRTLK1SHDALMANGDELNMQLKEQRLKCCSLEKQLHS^4KFSER 
RIEELQDRINDLEKERELLKENYDKLYDSAFSAAHEEQWKLKEQ 
QLKVQIAQLETALKSDLTDKTEILDRLKTERDQNEKLVQENREL 
QLQYLEQKQQLDBLKKRIKLYNQENDINADELSEALLLIKAQKE 
QKNGDLSFLVKVDSEINKDLERSMRELQATHAETVQELEKTRNM 
L IMQHKINKDYQMEVEAVTRKMENLQQDYELKVEQYVHLLD I RA 
ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLBRG 

ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV
VRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITLEVHQAYSTEYE
TIAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL

RVPMDQAIRLYRERAKALGYITSNFKGPEHMQSLSQQAPKTAQL 
SSTDSTDGNIjNELHITIRCCNHLQSRASHLQPHPYWYKFFDFA 
DHDTAI I PS SNDPQ FDDHM YF P VPMNMDLDR YL KS E S LS FYVFD 
DSDTQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH 
V I L KW KFA YL P P S GS I TTEDLGNF IRS EE PBWQRLP PAS S VST 
LVLAPRPKPRQRLTPVDKKVS FVD I MPHQSDVS QEGSVDEVKEN 
TEKMQQGKDDVSLLSEGQLAEQSLASSEDETEITEDLEPEVEED 
MSASDSDDCI I PGPISKNI KQPSEKI RIEI IALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK 
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAHVDLADMFQEGRDLIEQNIDVFDARADGEGIGKLRVTVEAL 
HALQSVYKQYRDDLEA 


6346 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSVVSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW
DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSVVSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6349 


3 


3679


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6350 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6351 


1291 


319 


REARRRTERSQLGRMLWEVANGRSLVWGAEAVQALRERLGVGG 
RTVGALPRGPRQNSRLGLPLLLMPEEARLLAEIGAVTLVSAPRP 
DSRHHSLALTSFKRQQEESFQEQSALAAEARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA 
GPSSSQAGPSNGVAPLPRSALLVQLATARPRPVKARPLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHYIAQCWAPEDTIPLQDLVAAGRLGTSVRKTLLLCSPQ 
PDGKWYTSLQWASLQ 1 


6352 


235 


923 


WSBWLSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP 
AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 
LMGNMNP EGGVNH3NGMNRDGGM I PEGGGGNQEPRQQPQPPPEE 
PAQAAMEGPQPENMQPRTRRTKFTLLQVEELESVFRHTQYPDVP 
TRRELASNLGVTEDKVRVWFKNKRARCRRHQRELMLANEl^ADP 
DDCVYIWD 


6353 


65 


672 


RFAGAGAI PEARARPPD VQAAEEB KEMDLPDSASRVFCGR I LSM 
VNTDDVNAIILAQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER 
FLHHTRTLVEMKRDLDS I FRRIRTLKGKLARQHPEAFSHI PEAS 
FLEEEDEDPI PPSTTTT I ATS EQS TGS CDTSPDTVS PSLS PG FE 
DLSHVQPGSPAINGRSQTDDEEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNQEVFC 
HP VTDQSL I VELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 
S VQPL S LENLALRGRCQEAWVLSG KQQ I AKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPP 


6355 


158 


1662 


RGSSAAFRGSGLRGAMIRRVLPHGMGRGLLTRRPGTRRGGFSLD 
WDGKVSEIKKKIKSILPGRSCDLLQDTSHLPPBHSDWIVGGGV 
LGLSVAYWLKKLESRRGAIRVLWERDHTYSQASTGLSVGGICQ 
QFSLPENIQLSLFSAS FLRNINEYLAWDAPPLDLRFNPSGYLL 
LAS EKDAAAMESNVKVQRQEGAKVSLMS PDQLRNKFPWINTEGV 
ALASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCQGEVTRFVSS 
S QRMLTTDD KAWL KR I HEVHVKMDRSLE YQ PVECA I VINAAGA 
WSAQIAALAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPQGPGL 
ETPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
FFQDKVWPHLALRVPAFETLKVQSAWAGYYDYNTFDQNGWGPH 
PLWNM YFATGFSGHGLQQAPG IGRAVAEMVLKGRFQT I DLS PF 
LFTRFYLGEKIQENNI I 


6356 


354 


633 


TGLTSSCLPLQVMMTKRTKDMGKFSSVTVSTIDEEEEEIEAREV~~ 

ADSYAQNAKVIEKQLERKGMSKRRLQELAELEAKKAKMKGTLID 

NQFK 


6357 


2 


915 


GLLRNMALLVRVLRNQTS I SQ WVPVCS R L I P VS PTQGQGDRALS 
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
y e V fc.fi, KVLiAr L K.I IEAMGr TGPXjKY SKWKI KIAALRMYTS CVEK 
TD FEE FFLRCQM PDT FNS WFL I TLLHVWMCLVRMKQEGRS GKYM 
CR 1 1 VHFMWEDVQQRGRVMGVNP YI LKKNM I LMTNHFYAAILGY 
DEGILSDDHGLAAALWRTFFNRKCEDPRHLELLVEYVRKQIQYL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
AI S KTAVAP I ER VKLLLQ VQHAS KQ I AADKQ Y KG I VDC I VRI P K 
EQGVLS FWRGNLANV I R Y FPTQALN FAF KD K YKQI FLGG VDKHT 
QF WR Y FAGNLAS GGAAGATSLC FVY PLD F ARTRLAADVGK S GTE 
REFRGLGDCLVKITKSDGIRGLYQGFSVSVQGIIIYRAAYFGVY 
DTAKGMLP D P KNTH I WS WM IAQTVTAVAG WS YPFDTVRRRMM 
MUb^KKAjAJJJLPlY TGT VDLWKKI r KDEGGKAF FKGAWSNVLRGMG 
GAFVLVLYDELKKVI 


6359 


98 


1086


VCRQEEEKMKEDCLPSSHVPISDSKSIQKSELLGLLKTYNCYHE 
GKS FQLRHRE EEGTL 1 1 EGLLN I AWGLRRP IRLQMQDDRE QVHL 
PS TS WMPRR P SC P LKE PS PQNGN I TAQGP S I QPVHKAE S S TDS S 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQRIRRHRFS 
INGHFYNHKTSVFTPAYGSVTNVRVNSTMTTLQVLTLLLNKFRV 
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCEKIARIF 
LMEADLGVEVPHEVAQYIKFEMPVLDSFVEKLKEEEEREIIKLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEEWIjPPRSCRVFWIHSGTTMSKVSFKITLTSDP 
RL P Y KVLS VPES T P FTAVL KFAAEE FKVPAATS A I ITNDG I G I N 
PAQTAGNVFLKHGSELRI IPRDRVGSC 


6361 


615 


158 


RPGLGQLQHCAliAPQAGNRRCRFHGRLHALTRS THRGKPMS I MQ 
FKDTLNT PL PDS S P VAVP LGAP I AVAS TLS VEHNDG VE TG I WAC 
APGRWRRQITSQEFCHFIQGRCTFTPDDGETLHIQAGDALMLPA 
NSTGIWDIQETVRKTYVLIL 


6362 


350 


1576 


TTMDGSHSAALKLQQLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWSIALNLQKYCHIRLAGSKDPRAYFKTKTWWLGLFLMLLG 
ELGVFAS YAFAPLS L IVP LS AVS VI AS AI IGI I FI KEKWKPKDF 
LRRYVLSFVGCGLAWGTYLLVTFAPNSHEKMTGENVTRHLVSW 
PFLLYMLVEIILFCLLLYFYKEKNANNIWILLLVALLGSMTW 
TVKAVAGMLVLS I QGNLQLD YP I F YVMFVCMVATAVYQAAFLSQ 
ASQM YDS S LIAS VG Y I LS TT I AITAGAI F YLD F I GED VLH I CM F 
ALGCL I AFLGVFL I TRNRKKP I PFEP Y I SMDAMPGMQNMHDKGM 
T VQPE LKAS FS YGALENNDN I S E I YAP ATLPVMQE EHGS RS ASG 
VPYRVLEHTKKE 


6363 


21 


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWIDNGSGVIKAGFAG
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGVVLD
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF


6364


21 


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWIDNGSGVIKAGFAG
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGVVLD
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF


6365


234 


1989 


KHKSRASCAARAQAFGPSREREVHSRFRSGLRRLGESNSGCCTM " 
Ah>M(j I JUAFDEYGRPFLI I KDQDRKS RLMGLEALKSHIMAAKAVA 
NTMRTS LGPNGLDKMMVDKDGDVTVTNDGAT ILSMMDVDHQ I AK 
LMVELS KSQDDE IGDGTTGVVVLAGALLEEAEQLLDRG IHPIRI 
ADGYEQAARYAI EHLDKI SDSVLVDI KDTEPLIQTAKTTLGSKV 
VNS CHRQMAE IAVNAVLT VADMERRDVDFEL I KVEGKVGGRLED 
TKLIKGVIVDKDFSHPQMPKKVEDAKIAILTCPFEPPKPKTKHK 
LDVTSVEDYKALQKYEKEKFEEMIQQIKETGANLAICQWGFDDE 
ANHLLLQNNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
G FAGLVQE I S FGTTKDKMLVT EQC KNS RAVT I F I RGGNKM 1 1 E E 
AKRSLHDALCVI RNLIRDNR WYGGGAAE ISCALAVSQEADKCP 
TLEQYAMRAFADALEVI PMALS ENSGMNP IQTMTE VRARQ VKEM 
N PALG I DCLH KGTNDM KQQHVI ETL IG KKQQ I S LATQMVRM ILK 
IDDIRKPGESEE 


6366 


OCT 

AD / 


1898 


LtfNJUSliAfibiSTt'W Vtiliijl MjUAVAMJjCKEQGITVLGLNAVFDILV 
IGKFNVLE I VQ KVLHKD KS LENLGMLRNGGLLFRMTLLTS GGAG 
MLYVRWRIMGTGPPAFTEVDNPASFADSMLVRAVNYNYYYSLNA 
WLLLCPWWLCFDWSMGCIPLIKSISDWRVIALAALWFCLIGLIC 
QAL CS E DGHKRR I LTLGLG FIjVI P FL PASNLFFRVG F WAERVL 
YLPSVGYCVLLTFGFGALSKHTKKKKLIAAVVLGILFINTLRCV 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKYVHAMNNLGNILKERNELQEAEELLSLAVQIQ 
PDFAAAWMNLGIVQNSLKRFEAAEQSYRTAIKHRRKYPDCYYNL 
GRLYADLNRHVDAIiNAWRNATVLKPEHSLAWNNM 1 1 LLDNTGNL 
AQAEAVGRE ALE L I PNDHSLMFSLANVLGKSQKYKESEALFLKA 
I KANPNAAS YHGNLAVLYHR WGHLDLAKKHYEI S LQLDPTASGT 
KENYGLLRRKLELMQKKAV 


6367 


287 




SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTQEEWALLDPS
QKNLYRDVMQETFKNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVCGEVGT

SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGKAFTRSTTLPVHER
THTGVNADECKECGNAFSFPSEIRRHKRSHTGEKPYECKQCGKV
FISFSSIQYHKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH
ECKQCGKAFRYFSSLHIHERTHTGDKPYECKVCGKAFTCSSSIR 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLNPRSWPRTAGALPLRP PPLTMAVFHDE VE I EDFQYDE " 
DSETYFYPCPCGDNFS ITKEDLENGEDVATCPSCSLI I KVI YDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTIHGSPREDTGT
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET
FKNLTSVGKTWKVQNIEDEYKNPRRNLSLMREKLCESKESHHCG
ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKEKPY
DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL
IHERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTGVNADECKE
CGNAFSFPSEIRRHKRSHTGEKPYECKQCGKVFISFSSIQYHKM
THTGEKPYECKQCGKAFRCGSHLQKHGRTHTGEKPYECRQCGKA
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
SSLHIHERTHTGDKPYECKVCGKAFTCSSSIRYHERTHTGEKPY
ECKHCGKAFISNYIRYHERTHTGEKPYQCKQCGKAFIRASSCRE
HERTHTINR 


6370 


1711 


329 


FVLSEQRLRTSRTWPRSPGLGRGAAAAGARTAGAGLLRLLLGCX3 
ALVGGLR P VTMTT PANAONAS KTW E I »S T , YF T . WT? TOHRJiT MnnTT? 

IAVSPRSLHSELMCPICLDMLKNTMTTKECLHRFCSDCIVTALR 
SGNKECPTCRKKLVSKRSLRPDPWFDALISKIYPSREEYEAHQD 
RVLIRLSRLHNQQALSSSIEEGLRMQAMHRAQRVRRPIPGSDQT 
TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SS VGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSP PGAPS 
PPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
LALR I ALE RRQQQEAGE PGG PGGG AS DTGG PDG CGGEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMN FGTKS FQ PRP PDKGS FPLDHLGE CKS FKE KFMKC 
LHNNNFENALCRKE S KE YLECRME RKLMLQE PLE KLGFGDLTSG 


6372 


2141 


625 


R VS AI ASEGKAE ER YKKLEDLLE KS FS LVKM P S LQ PWMC VMKH 
LPKVP EKKLKLVMAD KELYRACAVE VRRQ I WQDNQALFGDE VS P 
LLKQ Y I LEKESALFS TELSVLHNFFS PSPKTRRQGEWQRLTRM 
VGK1WKLYDMVLQFLRTLFLRTRNVHYCTLRAELLMSLHDLDVG 

QVLGDLSMILCDPFAIOTLAI^TVRHLQELVGQETLPRDSPDLL 
LLLRLLALGQGAWDM IDS Q VFKE P KME VEL I TRFLPMLM S FLVD 
DYTFNVDQKLPAEEKAPVSYPNTLPESFTKFLQEQRMACEVGIjY 
YVLHITKQRNKNALLRLLPGLVETFGDLAFGDI FLHLLTGNLAL 
IxADEFALEDFCSSLFDGFFLTASPRKENVHRHALRLLIHLHPRV 
APSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLiDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PSRAARAS PARLPAMVS W IIS RLWL I FGTLYPAYYS YKAVKS K 
D I KE YVKWMM YW I I FALFTTAETFTDI FLCWFPFYYELKIAFVA 
WLLSP YTKGSSLLYRKFVHPTLSS KE KE IDDCLVQAKDRS YDAL 
VH FGKRGLNVAATAAVMAAS KGQGALS ERLRS FSMQDLTT I RGD 
GAPAPSGPP PPG^GR A^GKHGOPKMQR 9 aQTT<? a C Q cr^Ta 


6374 


535 


2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHTFCNYTSSTI FLSSTRDHSCPTHTS CNYTSST I FLS STRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPAELQTEGSNGKKEVLSGFQWLEDTVLFPEGGGQPDDRGTIN 
DISVLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH 
SGQHL I TAVADHLFKLKTTS WELGRFRSAI ELDTP S MTAEQVAA 
IEQSVNEKIRDRLPVNVRELSLDDPEVEQVSGRGIiPDDHAGPIR 
WNIEGVDSNMCCGTHVSNLSDLQVIKILGTEKGKKNRTNLIFL 
SGNRVLKWME RS HGTEKALTALLKCGAEDHVEAVKKLQNSTKI L 
QKNNLNLLRDLAVH I AHS LRNS PDWGG WI LHRKEGDS E FMNI I 
ANE I GS E ETLL FLTVGDEKGGGL FLIiAG P PAS VETLGPR VAEVL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKE 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRPRRPGQQVVTIGPTMLVTAYIjAFVGLIjASCLGLELSRCRAK 
PPGRACSNPSFLRFQLDFYQVYFLALAADWLQAPYLYKLYQHYY 
FLEGQIAILYVCGLASTVLFGLVASSLVDWLGRKNSCVLFSLTY 
SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWI PATFARAAFWNHVLAWAGVAAEAVASWIGLGPVAP 
FVAAI PLLALAGAIaALRNWGENYDRQRAFSRTCAGGLRCIjLSDR 
RVLLLGTIQALFESVI FIFVFLWTPVLDPHGAPLGI I FSSFMAA 
SLLGSSLYRIATSKRYHLQPMHLLSLAVLIWFSLFMLTFSTSP 
GQESPVESFIAFLLIELACGLYFPSMSFLRRKVTPETEQAGVLN 
Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








WFRVPLHSLACLGLLVLHDSDRKTGTRNMFSICSAVMVMALIAV 
VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
QDLLIAALGMKLGSPKSS VT I WQPLKLFAYS QLTS LVRRATLKE 
NEQIPKYEKIHNFKVHTFRGPHWCEYCANFMWGLIAQGVKCADC 
GLNVHKQCS KMVPNDCKPDLKHVKKVYS CDLTTLVKAHTTKRPM 
WDMC I RE I BSRGLNSEGLYRVSGFSDL I ED VKMAFDRDGE KAD 
I S VNMYED INI I TGALKLYFRDLPI PI*I T YDAYPKF I ESAKIMD 
PDEQLETLHSALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLGIVFGPTLMRSPELDAMAALNDIRYQRLWELLIKNEDILF 


6377 


2311 


1845 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
QRVEDVRLIREQHPTKIPVIIERYKGEKQLPVLDKTKFLVPDHV 
NMSELIKIIRRRLQLNANQAFFLLVNGHSMVSVSTPISEVYESE 
KDEDGFL YMVYAS QETFGMKLS V 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVA 
DLALI PDVDI DSDGVFKYVL IRVHS APRSGAPAAE SKE I VRGYK 
WAE YHAD I YDKVSGDMQKQGCD CECLGGGR I S HQS QDKKI HVYG 
YS MAYG P AQHAI S TEK I KAKYPDYE VTWANDG Y 


6379 


35 


378 


BRAGS PS PSRAALRRCAPQRSQAPRWPDRAACRRS FQGSQGRAY 
LFNSWNVGCGPAEERVLLTGLHAVADIYCENCKTTLGWKYEHA 
FESSQKYKEGKYI IELAHMIKDNGWD 


6380 


1414 


462 


PAVQGQRGAGP PTGRGSGNMARFALTWRHGETRFNKEKI IQGQ 
GVDEPLS ETGFKQAAAAGIFTiNNVKFTHAFSSDLMRTKQTMHGI 
LERS KFC KDMT VfCYDS RLRERK YG WEG KAL S EL RAMAKAARE E 
CPVFTPPGGETLDQVKMRGIDFFEFLCQLILKEADQKEQFSQGS 
PSNCLETS LAE I FPLGKNHSSKVNSDSG I PGLAAS VLWSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELMSVTPNTGMSLFIINFEEG 
REVKP T VQC I CMNLQDHLNGLTENS LGLNLPS KSNHFE PLKGVP 
LALFTSLLC 


6381 


1668 


218 


AWRAQGSRGFS GAGWRPRQAAAMNFSEVFKLSSLLCKFS PDGK 
YLAS CVQ YRLWRDVNTLQ I LQL YTCLDQI QH I E W S ADS L F I LC 
AMYKRGLVQVWS LEQPEWHCKIDEGSAGLVASCWS PDGRH I LNT 
TEFHLR I TVWSLCTKS VS Y I KYP KACLQGITFTRDGR YMALAER 
RDC KD YVS I F VCS D WQLLRH FDTDTQDLTG I EWAPNGCVLAVWD 
TCLE YKI LLYS LDGRLLSTYSAYEWSLGIKS VAWS PSSQFLAVG 
S YDG KVR I LNHVTWKM I TE FGHPAA INDPKI WY KEAE KS PQ LG 
LGCLSFPPPRAGAGPLPSSESKYEIASVPVSLQTLKPVTDRANP 
KIG IGMLAFS PDS YFLATRNDNIPNAVWVWDIQKLRLFAVLEQL 
SPVRAFQWDPQQPRLAICTGGSRLYLWSPAGCMSVQVPGEGDFA 
VLSLCWHLSGDS MALLS KDHFCLCFLETEAWGTACRQLGGHT 


6382 


2 


1062 


FE EDE DRNLCL I A YPLKGDHG I VD I VDNS DCE PKS KLLRWTTNK 
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 
TVEKKGN I SSQLKHYNPWSMKCHQQQLQRMKEN7UCHRNQ YKFIL 
LENLTSRYEVPCVLDLKMGTRQHGDDASEEKAANQIRKCQQSTS 
AVIGTOVCGMQVYQAGSGQLMFMNKYHGRKLSVQGFKEALFQFF 
HNGRYLRRELLGPVLKKLTELKAVLERQESYRFYSSSLLVIYDG 
KERPEWLDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 
IDFAHTTCRLYGEDTWHEGQDAGYI FGLQSLIDIVTE ISEESG 
E 


6383 


3159 


1061 


SPAPGRPSPHGSQPAARAAAAPAMPSAKQRGSKGGHGAASPSEK 
GAHPSAARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRSLASQP 
AARAAAAPAMPSAKQRGSKGGHGAAS PS EKGAHPS GGADDVAKK 
PPPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
S S S S S AS AAAAAAAAS S S AS CS RRLGRALNFL FYLALVAAAAFS 
G W CVHHVLE E VQQ VRRS HQD FS RQREELGQGLQG VEQKVQSLQA 
TFGTFESILRSSQHKQDLTEKAVKQGESEVSRISEVLQKLQNEI 








LKDLSDGIHWKDARERDFTSLENTVEERLTELTKSINDNIAIF 
TEVQKRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSRE WDMEALRS TLQTMESDI YTEVRELVS LKQEQQAFKEAAD 
TERLALQALTEKLLRS EESVS RLPEE I RRLEEELRQLKSDSHGP 
KEDGGPRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
ESLESLLS KSQEHEQRLAALQGRLEGLGS S EADQDGIiASTVRS L 
GETQLVLYGDVEELKRSVGELPSTVESLQKVQEQVHTLLSQDQA 
QAARLPPQDFLDRLSSLDNLKASVSQVEADLKMLRTAVDSLVAY 
SVKIETNENNLESAKGLLDDLRWDLDRLFVKVEKIHEKV 


6384 


738 


1904 


IWEVPVCLTHLLHLQQANQPLPPPSSSINEEDADEANRAIGEKR 
AAPDSG KKP KTPKTKQ Q KDPNE PQKP VS AYALFFRDTQAAI KGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ 
HGTV3A3 PQTLQQSLPRSIAPKPLTMRLPP4NQIVTSVTIAANMP 
SNIGAPLISSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ 
QQQMQQMQQQQLQQHQMHQQ I QQQMQQQHFQHHMQQHLQQQQQH 
LQQQINQQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVLSQVSIF 


6385 


2 


1584 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LAQGPDAATTDELSSLGSDSEANGFAERRIDKFGFIVGSQGAEG 
ALE E VPLEVLRQRE S KWLDMLNNWDKWMAKKHKKI RLRCQ KG I P 
P S LRGRAWQ YLSGG KVKLQQNPGKFDELDMS PGD P KWLDV I ERD 
LHRQFPFHEMFVSRGGHGQQDLFRVLKAYTLYRPEEGYCQAQAP 
IAAVLLMHMPAEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDMFFCEGVKI I FRVGLVLLKHALGSPEKVKACQGQYETI ER 
LR S L S P K I MQE AFLVQE WE LP VTERQI EREHL I QLRR WQE TRG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLDAPLPGS 
KAKP KP PKQAQKEQRKQMKGRGQLE KP PAPNQAMWAAAGDACP 
PQHVPPKDSAPKDSAPQDLAPQVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


T VCGS F YLG I MQRAS R L KRE LHMLATEP PPG> t TCWQDKDQMDDL 
RAQILGGANTPYEKGVFKLEVIIPERYPFEPPQIRFLTPIYHPN 
IDSAGRICLDVLKLPPKGAWRPSLNIATVLTSIQLLMSEPNPDD 
PLMADI SSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMLDNL 
PEAGDSRVHNSTQKRKASQLVGIEKKFHPDV 


6387 


1 


662 


PGPTHAS ADAWADAWAQPNMAMHNKAAPPQI PDTRRELAELVKR 
KQE LAETLANLERQ I YAFEGS YLEDTQM YGNI I RGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRELAELVKR 
KQELAETLANLERQ I YAFEGS YLEDTQMYGNI IRGWDRYLTNQK 
NSNS KNDRRNRKFKEAERLFS KS S VTS AAAVS ALAG VQDQL I EK 
REPGSGTESDTS PDFHNQENEPSQED PEDLDGS VQGVKPQKAAS 
oXMuonnaonAlUiAiNl^KHbrbfjMc ui DFEIDLKLNKKPRADY 


6389 


1074 




AEPGDRMAGHRLVLVLGDLHIPHRCNSLPAKFKKLLVPGKIQHI 
LOrcNLCTKESYDYLKTLAGDVHIVRGDFDENLNYPEQKVVTVG 
QFKIGL IHGHQVI P WGDMASLALLQRQFDVDILI SGHTHKFEAF 
EHENKFYINPGS ATGAYNALETNI IPS FVLMDIQAS TWTYVYQ 
LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTI AQRDDGVFVQE VTQNS PAARTGWKEGDQI VGATI YFDNLQ 
SGEVTQLLNTMGHHTVGLKLHRKGDRFFPSLGQTWDP 


6391 


5386 


2897 


VRWNSKTECYLS IQTQENFPANLNELVNCIVI SSLVTTQRJKLKA 
MSLLGSRNQLARAVLNPNPMDFCTKDLLTTTSERIIAYLRDFNE 
DQKKAIETAYAMVKHSPSVAKICLIHGPPGTGKSKTIVGLLYRL 








LTENQRKGHSDENSNAKI KQNRVLVCAPSNAAVDELMKKI I LE F 
KE KCKD KKNPLGNCGD I NL VRLG PEKSINSEVLKFS LDS QVNHR 
MKKELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DENIS KVSKERQELAS KIKEVQGRPQKTQS 1 1 ILESHI ICCTLS 
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRCN 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
MI SRLP I LQLTVQYRMHPDI CLFPSNYVYNRNLKTNRQTEAIRC 
S SDWPFQP YLVFDVGDGSERRDNDS Y INVQE I KLVME 1 1 KL I KD 
KRKDVSFRNIGIITHYKAQKTMIQKDLDKEFDRKGPAEVDTVDA 
FQGRQ KDC V I VTC VRANS IQGS I G FLAS LQRLNVT I TRAKYSL F 
ILGHLRTLMENQHWNQLIQDAQKRGAI I KTCDKNYRHDAVKILK 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDS KE ITLTVTS KDPERP PVHDQLQDPRLLKRMG I EVKGG 
IFLWDPQPSSPQHPGATPPTGEPGFPVVHQDLSHVQQPAAVVAA 
LS SHKP PVRGE PPAAS PEASTCQS KCDDPEEELCHRREARAFS E 
GEQEKCGSETHHTRRNSRWDKRTLEQEDSSSKKRXLL 


6392 


972 


186 


GRTGVDLASSMAHRLQIRLLTWDVKDTLLRLRHPLGEAYATKAR 
AHGLEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 
VLQTFHIAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 
ECRTRGLRLAVISNFDRRLEGILGGLGLREHFDFVLTSEAAGWP 
KPDPRIFQEALRLAHMEPWAAHVGDNYLCDYQGPRAVGMHSFL 
WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGL 


6393 


2017 


730 


TGGS KMAAVAT CGS VAASTGSAVATAS KSNVTS FQRRGPRASVT 
ND S G PRLVS I AGTR PS VRNGQLLVS TGLP ALDQ LLGGGLAVGTV 
LLIEEDKYNIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILQ 
BLPAPLLDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 
IGPVSSSRFGHYYDASKRMPQELIEASNWHGFFLPEKISSTLKV 
EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGS PLWGDD I CCAENGGNSHSLTKFLYVLRGLLRTSLSAC 1 1 TM 
PTHLIQNKAIIARVTTLSDVWGLESFIGSERETNPLYKDYHGL 
IHIRQIPRLNNLICDESDVKDLAFKLKRKLFTIERLHLPPDLSD 
TVS RS S KMDLAESAKRLGPG CGMMAGGKKHLD F 




— TTTp 


511 


GAAAGGEGARRR PAAMAT VMAATAAERAVLE Efe! FRWLLHDE VHA 
VLKQLQDI LKEASLRFTLPGSGTEGPAKQENF I LGS CGTDQ VKG 
VLTLQGDALSQADVNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 
NHVS QAI YLLTSRDQS YQ FKTGAEVLKLMDAVMLQLTRARNRLT 
T PATLTLPE I AASGLTRM FAP ALPSDLLVNVY I NLNKLCLTVYQ 
liHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC 
VIPWLNDALVYFTVSLQLCQQLKDKISVFSSY^SYRPF 


6395 


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSGWPAGRTPTETSNPGSSVM 
ESVT FEDVAVE F IQE WALLDS ARRS LC KYRMLDQ CRTLAS RGT P 
PCKPSCVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ 
DLLBEASSRDMQMGPGLFLRMQLVP S I EERETPLTREDRPALQE 
PPWSLGCTGLKAAMQIQRWI P VPTLGHRN P WVARDS GE 


6396 


1 


1221 


ANILS S PSKRGOKGTTiTf5Y9 P'RRT'PT.VTCrPMfsrin ffnuconc t nob 
I KES L KQ I LEES DSRQ I F YFL CLNLL FTFVELF YGVLTNS LGL I 
S DGFHMLFDCS ALVMGLFAALM SRWKATR I FS YG YGR I E I LS G F 
INGLFLIVIAFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
I G I CAFS HAHSHAHGASQGS CHS SDHS HSHHMHGHSDHGHGHSH 
GSAGGGMNANMRGVFLHVLADTLGSIGVIVSTVLIEQFGWFIAD 
PLCSLFIAILIFL6WPLIKDACQVLLLRLPPEYEKELHIALEK 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV 
TG I LKDAG VNNLT I QVE KEAY FQHMSGLS TGFHDVLAMTKQMES 
MKYCKDGT Y I M 


6397 


391 


122 


GAGGVGRFEAIRAPARMIEWCNDRLGKKVRVKCNTDDTIGDLK 
KLIAAQTGTRWNKIVLKKWYTIFKDHVSLGDYEIHDGMNLELYY 








Q 


6398 


353 


1306 


HKQMGPLINRCKKILLPTTVPPATMRIWLLGGLLPFLLLLSGLQ 
RPTEGSEVAIKI DFDFAPGS FDDQYQGCSKQVMEKLTQGDYFTK 
D I EAQ KNY FRMWQKAHLAW LNQG KVLPQNMTTTHAVAILFYTLN 
SNVHSDFTRAMAS VARTPQQ YERS FHFKYLHYYLTS AIQLLRKD 
S I MENGTLCYEVHYRTKDVHFNAYTGATIRFGQFLSTSLLKSEA 
QE FGNQTL FT I FTCLGAP VQ YFSLKKEVL I P P YE L FKV I NMS YH 
PRGDWLQLRSTGNLSTYNCQLLKASS KKCI PDP I AI ASLSFLTS 
VIIFSKSRV 


6399 


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRKRTWSKRGVAVSGPTK 
RRGMADSLESTPLPSPEDRLAKLHPSKELLEYYQKKMAECEAEN 
EDLLKKLELYKEACEGQHKLECDLQQREEEIAELQKALSDMQVC 
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLLALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RRIHLEEIQVQHQRNQNKIKELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDlsn^SKIKQYRVQCKKKEDKIGKVLPVMHE 
SHHAQSEYIKVMSLCRNE WYFSGRVEG I PKNLQFVM 


6400 


2520 


1053 


KTMKCDEVVYEVQSAILRHNCGYAMKTGKFFHNLMERKDFETWL 
DNISVTFLSLTDLQKNETLDHIilSLSGAVQLRHLSNNLETLLKR 
DFLKLLPLEIiSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGWQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFETS 
SL IGHSAR VYAL Y Y KDGLL CTGSDD L S AKLWD VS TGQ CVYG I QT 
HTCAAVKFDEQKLVTGSFDNTVACWEWSSGARTQHFRGHTGAVF 
SVDYNDELDILVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLQKCKVKSLLHSPGDYILLSADKYEIKIWPIGREINCKCLKTL 
S VSEDRS I CLQPRLHFDGKY I VCS SALGLYQWDFAS YDILRVI K 
TPEIANLALLGFGDIFALLFDNRYLYIMDLRTESLISRWPLPEY 
RKSKRGSS FLAGEASWLNGLDGHNDTGLVFATSMPDHS I HLVLW 
KEHG 


6401 


109 


766 


PGAAWSRPDLRGCCTGPQPALRMLVLPSPCPQPLAFSSVETME6 
PPRRTCRSPEPGPSSS IGSPQASS PPRPNHYLLIDTQGVPYTVL 
VDEESQRE PGASGAPGQ KKC YS CP VCS RVFE YMS YLQRHS I THS 
EVKPFECDICGKAFKRASHLARHHSIHIAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402 


1196 


279 


TTSQCGGIRQSSAIPVASMEFAAICLRNAIiLLLPEEQQDPKQEN 
GAKNSNQLGGNTES SESS ETCS S KSHDGD KF I PAPPS S PLRKQE 
LENLKCS ILACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHL YAAEAL I S LDlk I S DAI THLNPENVTD VS LG I S SNEQDQGS 
DKGENEAMESSGKRAPQCYPSSVNSARTVMLFNLGSAYCLRSEY 
DKARKCLHQAASMIHPKEVPPBAILLAVYLELQNGNTQLALQI I 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6403 


2 


1690 


RG IHTSVLQGNLQNQM YSHNWIMNLNNLNLTQVQQRNLI TNLQ 
RS VDDTSQ AIQR I KND FQNLQQ VFLQAKKDTD WLKEKVQS LQTL 
AANNSAIjAKANNDTLEDMNSQLNS FTGQMENITTI SQANEQNLK 
DLQDLHKDAENRTA I KFNQ LEER FQLFETD I VN I I SNI S YTAHH 
LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTLANIRLDSVSLR 
MQQDLMRS RLDTEVANLS VI MEEMKLVDSKHGQL I KNFT I LQG P 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KG PPGPPG PSGAWPLALQWE PTPAPEDNSCPPHWKNFTDKCYY 
FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 
WIGLTDSERENEWKWLDGTSPDYKNWKAGQPDNWGHGHGPGEDC 
AGLIYAGQWNDFQCEDVNNFICEKDRETVLSSAL 


6404 


1012 


222 


AAALAMAAPAPGLISVFSSSQELGAAIiAQLVAQRAACCLAGARA 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO: 


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








rfalglsggslvsmlarelpaavapagpaslarwtlgfcderLv 
pfdhaestyglyrthllsrlpipesqvitinpelpveeaaedya 
kklrqafqgdsipvfdllilgvgpdghtcslfpdhpllqereki 

VAPISDS PKPPPQRVTI/TLPVLNAARTVI F VATGBGKAAVLKR I 
LEDQEENPLPAALVQPHTGKLCWFLDEAAARLLTVPFEKHSPL 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAVATLRNHRPR 
TAQRAAAQVLGSSGLFNNHGLQVQQQQQRNLSLHEYMSMELLQE 
AGVS VPKG YVAKS PDEAYAIAKKLGSKDWI KAQVLAGGRGKGT 
FE SGLKGG VKI VFS P E EAKAVS S QM I G KKLFTKQTG E KGRI CNQ 
VL VCER K YPRRE Y YFAI TME R S FQGPVL I G S S HGG VN I E D VAAE 
TPEAIIKEPIDIEEGI KKEQ ALQLAQKMG F P PNTVE S AAENMVK 
LYSLFLKYDATMIBINPMVEDSDGAVLCMDAKINFDSNSAYRQK 
KIFDLQDWTQEDERD KD AAKANLNY I G LDGN I G CL VNG AG LAMA 
TMDI IKLHGGTPANFLDVGGGATVHQVTEAFKLITSDKKVLAI L 
VNIFGGIMRCDVIAQGIVMAVKDLEIKIPVWRLQGTRVDDAKA 
LIADSGLKILACDDLDEAARMWKLSEIVTLAKQAHVDVKFQLP 
I 


6406 


1036 


167 


HP RQMRGE DT P E AP P YS S GR YDS I KTE VS GCP EDLT VGRAPTAD 
D D DDDHDD HE DNDKMND S EGMDPERLKAFNMF VRLFVDENLDRM 
VPISKQPKEKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTR PTP PHLTSAMAENI LAAACES ETRKAAKRMRLE I YQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YS YRG YGALS S NLQ P PAS LQTGNHSNGE S GE ARALAS R P APS WV 
CRAALGSGMGRG KQ R P VMERG CLTA 


6407 


492 


150 


VGLCLAVS QTVLAQLDALLVF PGQ VAQLS CTLS PQHVTI RDYGV 
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 
VLTISPVQPEDDADYYCSVGYGFSP 


6408 


1458 


903 


RGC I TS SQ AWRL FGG VTRGFNMR I E KCYFCSGP I YPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
S FE FEKRRNE P I KYQRELWNKTI DAMKRVEE I KQKRQAKF IMNR 
LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


150 


446 


NTALANLLRCFTCDRLCGGCTAPAPPAHQGIVLQPVMPSCDPGP 
GPACLPTKTFRSYLPRCHRTYSCVHCRAHLAKHDELISKSFQGS 
HGRAYLFNSV 


6410 


85 


607 


RGGTAGCVACLGCWGQSSSPKAAFPAGSACLPADSCPCLLFQAC 
AI SGLFNC ITIHPLNIAAGVWMIMNAFILLLCEAPFCCQFI EFA 
NTVAEKVDRLRSWQKAVFYCGMAVVPIVISLTLTTLLGNAIAFA 
TGVL YGLS ALGKKGDAI S YARIQQQRQQADEEKLAETLEGEL 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAGIAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYIYYLITK 
KRASHKPTYENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 
WENVSAMIEEVFEATDIKITVYTL 


6412 


61 


1709 


RPVTS FS PLPGSCGGRLGTRTMLGRSLREVSAALKQGQ I TPTEL 
CQKCLSLI KKTKFLNAYITVS EEVALKQAEESEKRYKNGQSLGD 
L1X3IPIAVKDNFSTSGIETTCASNMLKGYIPPYNATVVQKLLDQ 
GALLMGKTNLDE FAMGSGSTDGVFGP VKNPWS YS KQYREKRKQN 
PHSENEDSDWLITGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 
AAHCGLVGFKPS YGLVSRHGL IPLVNSMDVPG ILTRCVDDAAI V 
IiSALAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLV 
PELSSEVQSLWS KAADLFES EGAKVI E VSLPHTS YS I VC YHVLC 
TS E VASNMARFDGLQ YGHRCD I DVSTEAM YAATRREG FND WRG 
R ILSGNFFLLKENYENYFVKAQKVRRL IANDFVNAFNSGVDVLL 
TPTTLS EAVP YLEF I KEDNRTRSAQDD I FTQAVNMAGLPAVS I P 
VALSNQGLPIGLQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
LMDDCSAVLENEKLASVSLKQ 


6413 


2 


885 


HEPRCAGMAASLWMGDLEPYMDENFI SRAFATMGETVMSVKI IR 
NRLTG I PAG YCFVBFAD LATAE KCLHKINGKPL PGAT P AKRPKL 
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG 
GKWLDQTGVSKGYGFVKFTDELEQKRALTECQGAVGLGSKPVR 
LSVAIPKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQ S E BL YDALMD CHWQ PLDTVS S E I PAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPRPQPARPSSRATPGPRSPGMATSldV 
SFSVGDGVPEAEKNAGEPENTYILRPVFQQRFRPSWKDCIHAV 
LKEELANAEYSPEEMPQLTKHLSENIKDKLKEMGFDRYKMWQV 
VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCWAAFGC 
FYY 


6415 


2 


1168 


FVRQWQSSHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP 
F I EAAVAGLGAGSGKRRRGWKMP VHSRGDKKETNHHDEME VDYA 
EWEGSSSEDEDTESSSVSEDGDSSEMDDEDCERRRMECLDEMSN 
LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQIRTKVAGIYRELCLESVKNKYECEIQASRQHCESEKLLLYD 
TVQSELEEKIRRLEEDRHSIDITSELWNDELQSRKKRKDPFWPD 
KKKPGWSGPYIVYMLQDLDILEDWTTIRKAMATLGPHRVKTEP 
P VKLEKHLHS ARS EEGRL YYDGE W YI RGQT I C I DKKD EC P TS AV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIKHS 


6416 


410 


1519 


E I APADLE I PACAPVLLSRATSS TMS VTGGKMAPSLTQE I LSHL 
GLAS KTAAWGTLGTLRT FLNF S VDKD AQRLLRAI TGQG VDRSAI 
VDVLTNRSREQRQLISRNFQERTQQDLMKSLQAALSGNLERIVM 
ALLQPTAQFDAQELRTALKASDSAVDVAIEILATRTPPQLQECL 
AVYKHNFQVEAVDGITSETSGILQDLLLALAKGGRDSYSGIIDY 
NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQWRFHGDAQVALLGLASVIKNTPLYFADKLHQALQ 
ETEPNYQVLIRILISRCETDLLSIRAEFRKKFGKSLYSSLQDAV 
KGDCQSALLALCRAEDM 


6417 


1 


845 


RGESRVLWS ELEGEAGGAGG WAS S LNARMDNRFATAF VI ACVLS 
LISTI YMAAS IGTDFWYE YRSPVQENSSDLNKS I WDEFISDEAD 
E KTYNDAL FRYNGTVGLWRRC IT I PKNMHW YS P P E RTES FDWT 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL 
GLMCFGAL IGLCACI CRSLYPTI ATG ILHLLAGLCTLGS VS C YV 
AGIELLHQKLELPDNVSGEFGWSFCLACVSAPLQFMASALFIWA 
AHTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 
TPAP P P PP P CGG I ACHGE PAKFYG YDNLQRQ P I FTTQQE AELVQ 
YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYS RFQTLE LE KE FLFN P YLTRKRR I EVSHALALTERQVK I WFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAQELEEDRAEGLTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRS IQI PANRSKTAMSKdP t FP 
MARSISTSGPLDKEDTGRQKLISTGSLPATLQGATDSLGLEWHL 
PS PDP VTVP YLS PLWWKELESLLENEGDHAI TVADFVDHHP I V 
FWNLWYFRRLDLPSNLPGLILSSEHCNKYSKIPRHCMSEDSKY 
VL I QMLWDNMKLHQDPGQ P L Y I LWNAHTQKYPMVHLLQKS DNS F 
NQELLKSMVKSIKMNDVYGPMSQILETLNKCPHFKRQRSLYREI 
LFLS L VALGREN I D I DAFDKE YKMA YDRLTPSQVKS THNCDR P P 
STGVMECRKTFGEPYL 


6420 


207 


1187 


RKMIDKNQTCGVGQDS VP YMI CLIH I LEEWFGVEQLEDYLNFAN 
YLLWVFTPLILLILPYFTI FLLYLTI IFLHI YKRKNVLKEAYSH 
NLWDGARKTVATL WDGHAAVWHG YE VHGME KI PEDGPAL 1 1 F YH 
GAI P I DFYYFMAKI F I H KGRTCRWADHFVFKI PGFS LLLDVFC 
ALHGPREKCVEI LRSGHLLAI S PGGVREALI SDET YN I VWGHRR 
G FAQVAI DAKVP 1 1 PMFTQNIREGFRSLGGTRLFRWLYEKFRYP 
FAPMYGGFP VKLRTYLGDP I PYDPQITAEELAEKTKNAVQALID 
KHQRIPGNIMSALLERFH 


6421 


1844 


362 


walslrrqpermsnkllsphphsvvlrsefkmasspavlrasrl 
yqwslkssaqflgspqlrqvgqiirvparmaatlilepagrccw 
depvriavrglapeqpvtlraslrdekgalfqaharyradtlge 
ldlerapalggs fagle pmgllwale pe kplvrlvkrd vrt p la 
velevldghdpdpgrllcqtrheryflppgvrrepvrvgrvrgt 
lflppepgpfpgivdmfgtggglleyrasliagkgfavmalayy 
nyedlpktmetlhleyfeeamnyllshpevkgpgvgllgiskgg 
elclsmasflkgitaawingsvanvggtlrykgetlppvgvnr 
nri kvtkdg yadi vdvlns plegpdqks f i pverae st flflvg 
qddhnwksefyanbackrlqahgrrkpqiicypetghyieppyf 
plcras lhalvgs p 1 1 wgge prahamaq vdawkq lqt f fhkhlg 
gregtipskv 


6422 


181 


2133 


EG ENLS WF QE FWGD I AKE F YW KT PC PG P FLR YNFD VTKGKI FIE 
WMKGATTNI CYNVLDRUVHEKKLGDKVAFYWEGNEPGETTQIT Y 
HQLLVQ VCQ FS NVLRKQG I HKGDRVAI YMPM I P EL WAMLACAR 
I GALHS I V FAG FSS E SLCER I LDS S CSLL I TTD AF YRGE KLVNL 
KELADEALQKCQEKGF PVRCC I WKHLGRAELGMGDSTS QS P P I 
KRS CPD VQ I S WNQG I DLWWHELMQE AGDE CE PE WCD AED PLF I L 
YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
IGWITGHSYVTYGPLANGATSVLFEGIPTYPDVNRLWSIVDKYK 
VTKFYTAPTAIRLLMKFGDEPVTKHSRASLQVLGTVGEPINPEA 
WLWYHRWGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 
FPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
R FETTYF KKF PG Y YVTGDG CQRDQDGYYW I TGR I DDMLNVS GHL 
LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 
FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6423 


614 


1237 


ANLKE I PRDLP PE TVLL YLDSNQ ITS I PNE I FKDLHQLRVLNLS 
KNGIEFIDEHAFKGVAETLQTLDLSDNRIQSVHKNAFNNLKARA 
RIANNPWHCDCTLC2QVLRSMASNHETAHNVICKTSVLDEHAGRP 
FLNAANDADLCNLPKKTTDYAMLVTMFGWFTMVISYWYYVRQN 
QEDARRHLEYLKSLPSRQKKADEPDDISTW 


6424 


1 


1188 


KKVS WP VAAMVHCSCVLFRKYGNF I DKLRLFTRGGSGGMG Y PRL 
GGEGG KGGDVWVVAHNRMTL KQLKDR YPRKRF VAGVGANS K I S A 
LKGS KGKDWE I P VPVG I S VTDENGKI IGELNKENDR ILVAQGGL 
GGKLLTNFLPLKGQKR 1 1 HLDLKL IAD VGLVG F PNAGKS S LLS C 
VSHAKPAIADYAFTTLKPELGKIMYSDFKQISVADLPGLIEGAH 
MNKGMGHKFLKHI ERTRQLLFWDI SGFQLSSHTQYRTAFETI I 
LLTKELELYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNMI PERTVE FQHI IP I S AVTGEG I EELKNCI RKSL 
DEQANQENDALHKKQLLNLW I SDTMSSTEPPS KHAVTTS KMD 1 1 


6425 


1850 


1144 


LAMEGGGG I P LETL KEE SQS RHVLPAS FE VNS LQKSNWGFLLTG 
LVGGTLVAVYAVAT PFVTP ALRKVCL PF VP ATM KQ I ENWKMLR 
CRRGSLVDIGSGDGRIVIAAAKKGFTAVGYELNPWLVWYSRYRA 
WREGVHGS AKF Y I SDLW KVTFS Q YSNWI FGVP QMMLQLE KKL E 
RELEDDARVlACRFPFPHWTPDPTVTGEGrDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA 


6426 


30 


565 


S RGAAVGGMSVAGGE I RGDTGGE DTAAPGRFS FSPE PTLED I RR 
LHAEFAAERDWEQFHQPRNLLLALVGEVGELAELFQWKTDGEPG 
PQGWSPRERAALQEELSDVLIYLVALAARCRVDLPLAVLSKMDI 
NRRRYPAHLARSS S RKYTELPHGAI S EDQAVGPAD I P CDSTGQT 
ST 


6427 


145 


959 


AAS WGPPHVPKAGKMVSWM I CRLWLVFGMLCPAYAS YKAVKTK 
NIREYVRWMMYWIVFALFMAAEIVTDlFISWFPFYYEIKMAFVIj 
WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV 
LSFGKRGLNIAASAAVQAATKSQGALAGRLRS FSMQDLRS ISDA 
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 
VPRAPARPREKPLIRSQSLRWKRKPPVREGTSRSLKVRTRKKT 
VPSDVDS 


6428 


1982 


444 


SGSGGKMEDHQHVPIDIQTSKLLDWLVDRRHCSLKWQSLVLTIR 
E KI NAA IQDM P ES EE I AQLLSGS Y I HYFHCLR I LDLLKGTEAST 
KNIFGRYSSQRMKDWQEI IALYEKDNTYLVELSSLLVRNVNYEI 
PSXjKKQIAKCQQLQQEYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGELLALVKDLPSQLAE IGAAAQQSLGEAI DVYQAS VGF 
VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSVVERPHLEEL 
PEQ VAEDAIDWGDFGVEAVS EGTDSG I SAEAAG I DWG I FPESDS 
KDPGGDGIDWGDDAVALQITVLfiAGTQAPEGVARGPDALTLLEY 
TETRNQFLDELMELE I FLAQRAVELS E EADVLS VSQFQLAPAI L 
QGQTKEKMVT^SVLEDLIGKLTSLQLQHLFMlLASPRYVDRVT 
EFLQQ KLKQS Q LLALKKELM VQKQQE ALEE QAALE P KLDL LLEK 
TKELQKLIEADISKRYSGRPVNLMGTSL 


6429 


3413 


3442 


EPSSWTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVFANGTL 
WKSVTDKDAGDYLCVARNKVGDDYWLKVD WMKPAKI EHKEE 
NDHKVFYGGDL KVDCVATGLPNPEI S WSLPDGSLVNS FMQSDDS 
GGRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR 
VKWTAPATI RNKTCLAVQVP YGDWTVACEAiCGE PMPKVTWLS 
PTNKVIPTSSEKYQIYQDGTLLIQKAQRSDSGNYTCLVRNSAGE * 
DRKTVW IHVNVQP PKINGNPNP ITTVRE I AAGGS RKLIDCKAEG 
I PT PR VLWAF P EG WLPAP YYGNR 1 TVHGNGS LD I RS LRKSD S V 
QLVCMARNEGfeEARLI VQLTVLEPME KP I FHDPI S EKITAMAGH 
T I S LNCSAAGTP T PS LVWVLPNGTDLQS GQQLQRF YHKADGM LH 
ISGLS S VDAGAYRCVARNAAGHTERL VS LKVGLKPEANKQ YHNL 
VS I INGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 
LDNGTLTVREASVFDRGTYVCRMETEYGPSVTS I PVI VI AYPPR 
ITSEPTPVIYTRPGNWKLNCMAP^GIPKADITWELPDKSHLKAG 
VQARL YGNRFLHPQGS LT I QHATQRDAG FYKCMAKN I LGS DS KT 
TYIHVF 


6430 


1946 


602 


RTRVSTGLRRTLLWSEAVGASSTRGDTGI PGSGEGGAGPGGGEG 
AMLEAMAEPSPEDPPPTLKPETQPPEKRRRTIEDFNKFCSFVIiA 
YAGYI P PS KEE SDWPASGS S SPLRGE SAADSDGWDSAPSDLRT I 
QrFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM 
KLKDS LFD LDGPKVAS PLS PTS LTHTS RP PAALT P VPLS QGDLS 
HPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 
KRKLKKAERGDRLP P PGP PQAP PSDTDS EEEEEEEEE EEEEEMA 
TWGGEAPVPVLPTPPEAPRPPATVHPEGVPPADSESKEVGSTE 
TS QDGDAS S S EGEMRVMDED I MVESGDD S WDL ITC YCRKP FAGR 
PKI ECS LCGT WIHLS CAKI KKTNVPDFFYCQKCKELRP EARRLG 
GPPKSGEP 


6431 


•a 

•3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR 
LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPEPV 
IEEVDLANLAPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER 
LKGQEDS LAS AVDAATEQKT CD S D 


6432 


56 


1692 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
D YSDQE VLQTLTKFCFP FYVDSLTVSQ VGQNFTFVLTD I DSKQR 
FG FC RLS SGAKSCFC I LS YLP WFE VFYKLLN I LAD YTTKRQENQ 
WNELLETLHKLPIPDPGVSVHLSVHSYFTVPDTRELPSIPENRN 
LTEYFVAVDVNNMLHLYASMLYERRILI I CSKLSTLTACIHGSA 
AMLYPMYWQHVYIPVLPPHLLDYCCAPMPYLIGIHLSLMEKVRN 
MALDDWI LNVDTNTLETPFDDLQS LPNDVIS SLKNRLKKVS TT 
TGDGVARAFLKAQAAFFGSYRNALKIEPEBPITFCEEAFVSHYR 
SGAMRQFLGNATOLOLF KOF IDGRLD LLN ^ ava R<3D VPT? twm 

GE YAGSDKLYHQWLS TVRKGSGAI LNTVKTKANPAMKTVYKFD I 
AENGCAPTPEEQLPKTAPS PLVEAKDPKLREDRRPITVHFGQVR 
PPRPHWKRPKSNIAVEGRRTSVPSPEQNTIATPATLHILQKSI 
THFAAKFPTRGWTSSSH 


6433 


1524 


484 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA 
PTTPPQPGWCLCGPCDFKSSCQTPGREKERRLATMHGSCSFLMLL 
L PLLLLLVATTG P VGALTDEEKRLMVE LHNL YRAQVSP TAS DML 
HMRVTDEEIiAAFAKAYARQ C VWGHNKE RGRRG ENLFAITD EGMDV 
PLAMEEWHHEREHYNLSAATGSPGQMCGHYTQWWAKTERIGCX3 
SHFCEKLQGVEETNIELLVCNYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCEPIGSPEDAQDLPYLVTEAPSFRATEASDSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGLLLLPPLVLAGIF 


6434 


40 


2002 


MPQLNFGMADPTQMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE 
SPELRQKSPLFQFAEISSSTSHSDASTKQCQTSALFQFAEISSN 
1 toUwwUi J? V KRCGKSALFQXjAEMCLASEGMKMEES kli KAKES 
DGGRI KELEKGKEE KEI KME KTDETRLQKEAE FE KS AKBNLRDS 
KELRNFEALQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY 
SASSKI I ISDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 
KSKMDRHGNDKSTPKKTCKKRQSSESDIESVIYTIEAVAKGDWG 
IEKLGDTPRKKVRTSSSGKGSILDAKPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFISISASKNISGETPEGIKAEPLTPMEDALPPS 
LSGQAKPEDSDCHRKIETCGSRKSERSCKGALYKTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 
TKPKE D CLLGSAKLDEE FB KKFNS LPQ YS P VTFDRKCVPV P RKK 
K KTGNVS S E PTKTS KGSGD KWSNKQL FLDAI HPT EA I FSEDRNT 
ME P VHKVKN I PS I FNTPEP TTTARTFGGQ P KEKS KENPDYS P CQ 
DTORAG YHHEEVLWMTNTJ^^NNCOn WT .K" OT >P ht a MTrvra 


6435 


2527 


657 


ALQRDAAAAYAHPE YEERFLQEETVS QQ INS IELLQTRPLALPE 
WKSQRPLQRQVHLRGRPASQPTVIRGITYYECAKVSEEENDIEE 
QQDEFFSGDNGVDLL I EDQLLRHNGLMTS VTRRPAATRQGHS TA 
VTS DLNARTAP WS SAL PQPSTSDPS IANHAS VGPTLQTTS VS PD 
PTRES VLQ PS PQVPATT VAHTATQQPAAPAPPAVS PREALMEAM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPEEEDDIRNVI 
GRCKDTLST I TGPTTQNTYGRNEGAWMKDPLAKDER I YVIOTYY 
GN TLVE FRNL ENF KO GRW SNSYKT. P Y W T f5T« mnmrr a c wmd 

AFTRNI I KYDLKQRYVAAWAMLHDVAYEEATPWRWQGHSDVDFA 
VDENGLWLIYPALDDEGFSQEVIVLSKLNAADLSTQKETTWRTG 
LRRNFYGNCFVICGVLYAVDS YNQRNAN I SYAFDTHTNTQIVPR 
LLFENE YF YTTQ I D YNPKDRLL YAWDNGHQVT YHV I FAY 


6436 


1295 


341 


GACRPP VRQDPDSG PD YEALPAGATVTTHMVAGAVAG I LEHCVM 
YP I D C VKTRMQ S LQ PD PAARYRNVLEALWR 1 1 RTEGLWRPMRGL 
NVTATGAG PAHAL Y FAC YE KLKKTLS DVI HPGGNSH I ANGAAGC 
VATLLHDAAMNPAE WKQRMQM YNS P YHR VTD CVRAVWQNEGAG 
AFYRS YTTQLTMNVP FQAI HFMT YE FLQEHFNPQRR YNP S S HVL 
SGACAGAVAAAATTPLDVCKTLLNTQESLALNSHITGH I TGMAS 
AFRTVYQVGGVTAY FRGVQARVI YQ I PSTAIAWS VYEFFKYLIT 
KRQEEWRAGK 


6437 


1828 


3*0 . 


PPAPAPPAS PARHVTRTARGHLEGGSRAPPLLQAVFLQ I KNMVK 
LIHTLADHGDDVNCCAFSFSLLATCSLDKTIRLYSLRDFTELPH 
SPLKFHTYAVHCCCFSPSGHIIJ^CSTDGTTVLWNTENGQMLAV 
MEQPSGSPVRVCQFSPDSTCLASGAADGTVVLWNAQSYKLYRCG 
SVKDGSLAACAFS PNGS FFVTGS S CGDLTVWDDKMRCLHS EKAH 
DLG ITCCDFSSQ P VSDGEQGLQF FRLAS OGQDCQVKI WI VS FTH 
Il^FELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 
TNTENI LHTLTQHTRYVTTCAFAPNTIjLLATGSMDKTVN I WQFD 
LETLCQARSTEHQLKQFTEDWSEEDVSTWLCAQDLKDLVGIFKM 
NNI DG KELLNLTKES LADDLKI E S LGLRS KVLRKI EE LRTKVKS 
LSSGIPDEFICPITRELMKDPVIASDGYSYEKEAMENVTOPAKRN 
RTSPP 


6438 


109 


901 


E VQ I LRAKMFQTGGL I VF YGLLAQTMAQFGGLP VPLDQTL PLNV 
NPALPLSPTGLAGSLTNALSNGLLSGGLLGILENLPLLDILKPG 
GGTSGGLLGGLLGKVTSVIPGLNNIIDIKVTDPQLLELGLVQSP 
DGHRL YVT I PLG I KLQWTP LVGAS LLRLAVKLD I TAE I LAVRD 
KQERIHLVLGDCTHSPGSLQISLLDGLGPLPIQGLLDSLTGILN 
KVL PE LVQGNVC P L VNEVLRGLD I TL VHD I VNML I HGLQ FV I KV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARKRKARRL 
KQAKE EAQMEVEQYRREREHE FQS KQQAAMGS QGNLS AEVEQAT 
RRQVQGMQS SQQRNRERVLAQLLGMVCDVRPQVHPNYR I S A 


6440 


3 


517 


RARWN SDMGDL PG L VRLS I ALR I Q PNDGP VF Y KVDGQRFGQNRT 
IKLLTGSSYKVEVKIKPSTLQVENISIGGVLVPLELKSKEPDGD 
RWYTGTYDTEGVTPTKSGERQPIQITMPFTDIGTFETVWQVKF 
YNYH KRDHCQWG S P FS VI EYE CK PNETRS 1/4 WVNKES FL 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGE EELP PGMEKFKAAMLLGS VGDALG Y 
RNVCKENSTVGMKI QEELQRSGGLDHLVLSPGE WPVSDNT I MH I 
ATAEALTTDYWCLDDLYREMVRCYVEIVEKLPERRPDPATIEGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 
L I EVS VE CGRMTHNHPTG FLGSLCTALF VS FAAQ GKPLVQ WGRD 
MLRAVP LAEE YCR KT I RHTAE YQEHWF Y FE AKWQ F YLEER K I S K 
DSENKAIFPDNYDAEEREKTYRKWSSEGRGGRRGHDAPMIAYDA 
LLAAGNSWTELCHRAMFHGGESAATGTIAGCLFGLLYGLDLVPK 
GLYQDLEDKEKLEDLGAALYRLSTEEK 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGAIiAGLKTVSSYS 
LQRQS LLDMSLVKLQLCHMLVE PNLCRS VLI ANTVRQ IQEEMTQ 
DGTWRTVAPQAAERAPLDRLVSTE I LCRAAWGQEGAHPASGLGD 
GHTQGPVSDLCPVTSAQAPRHLQSSAWEMDGPRENRGSFHKSLD 
QIFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSSSCKSDLGELDHWEILVET 


6443 


2 


555 


MAS PAASS VRPPRP KKEPQTLVI PKNAAEEQKLKLERLMKNPDK 
AVP I P E KMS EWAPR P P PE FVRD VMGS S AGAG SGE FHVYRHLRRR 
E YQRQD YMDAMAEKQ KLDAE FQKRL E KNK I AAEE QTAKRRKKRQ 
KLKEKKLLAKKMKLEQKKQEGPGQPKEQGSSSSABASGTEEEEE 
VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPI PE P KPGDL I E I FRP F YRHWAI YVGDGYWHLiA 
PPSEVAGAGAASVMSALTDKAIVKKELLYDVAGSDKYQVNNKHD 
DKYSPLPCSKIIQRAEELVGQEVLYKLTSENCEHFVNELRYGVA 
RSDQVRDVI IAASVAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


AG AAGAAGAARS PRPQAHTKG VRG LP S RRRS PDCGRME LAAGS F 
S EEQF WEACAELQQPALAGADWQLLVETS G I S I YRLLD KKTG L Y 
EYKVFGVLEDCSPTLLADIYMDSDYRKQWDQYVKELYEQECNGE 
TVVYWEVKYPFPMSNRDYVYLRQRRDLDMEGRKIHVI LARSTSM 
PQLGERSGVIRVKQYKQSLAI ESDGKKGS KVFMYYFDNPGGQ I P 
S W L I NWAAKNG VPNFLKDMARACQNYLKKT 


6446 


1 


1651 


RCPTRSPPPDTPGSRGTTAMCSLASGATGGRGAVENEEDLPELS 
DSGDEAAVJEDEDDADLPHGKQQTPCLFCNRLFTSAEETFSHCKS 
EHQFNIDSMVHKHGLEFYGYIKLINFIRLKNPTVEYMNSIYNPV 
PWEKEEYLKPVLEDDLLLQFDVEDLYEPVSVPFSYPNGLSENTS 
WE KLKHMEARALSAEAALARARE DLQKMKQFAQDFVMHTDVRT 
CSSSTSVIADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD 
F I YQNPH I FKDKWLDVGCGTG I LSMFAAKAGAKKVLGVDQS EI 








LYQAMDI IRLNKLEDTITLI KGKIEE VHLPVEKVDVI I SEWMGY 
FLLFESMLDSVLYAKNKYLAKGGSVYPDICTISLVAVSDVNKHA 
DRIAFWDDVYGFKMSCMKKAVIPEAVVEVLDPKTLISEPCX3IKH 
IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
RVVFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK 
KDPRSLTVTLTLNNSTQTYGLQ 


6447 


1554 


1068 


RLGPAEWHLSGPCHATLGAAl^GRALGVRAAWRGAPLCQRVMMP 
SRTNLATGI PSS KVKYSRLSSTDDG YI DLQFKKTPPKI P YKAIA 
LATVLFLIGAFLI I IGSLLLSGYISKGGADRAVPVLI IGILVFL 
PGFYHLR I AY YAS KGYRGYS YDDI PDFDD 


6448 


74 


559 


GQ VLSHC YH YRSS RWRRGGLS RGRGAG VMAL VP YE ETTE FGLQ K 
FHKPLATFS FANHTI QI RQDWRHLGVAAWWDAAI VLST YLEMG 
AVELRGRSAVELGAGTGLVG I VAALLACR I RYERDNNFLAMLER 
QFI VRKVHYDPEKDVH I YEAQKRNQKEDL 


6449 


597 


1876 


EYGVCENLRKLEITGVSCRDVYAKLLHRYRHI LGLWQPD IGPYG 
GLLNWVDGLFIIGWMYLPPHDPHVDDPMRFKPLFRIHLMERKA 
ATVECMYGHKGPHKGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
FRTWLRE EWGRTLED I FHEHMQEL I LMKF I YT SQ YDNCLT YRRI 
YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTKITG 
D PN I PAGQQTVE I DLRHR I QL P DLENQRNFNELS RI VLE VRER V 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGEPGDAVAAAEQ PAQCGQGQ P FVLP VGVS S RNED YPRTCRM 
CFYGTGLIAGHGFTSPERTPGVFILFDEDRFGFVWLELKSFSLY 
SRVQATFRNADAPS PQAFDE MLKN I QS LTS 


6450 


848 


269 


FVPAPRTVSG KRS LPGE WE ERGEGEQRTGRE FS GNGGRAVE AAR 
MRLLCGLWLWLS LLKVLQAQTPTPLPLPPPMQS FQGNQFQGEWF 
VLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWNAMTRGQHC 
DTWSYVLIPAAQPGQFTVDHRVWTHEQAGRPQDQPAGQELVAAS 
RDAGPVHLPGQS SGPLG 


6451 


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 
ILPGLFLGPYSSAMKSKLPVLQKHGITHIICIRQNIEANFIKPN 
FQQL FR YLVLD I ADN P VEN 1 1 RFF P MTKEFIDGS LQMGGKVLVH 
GNAG I S RS AAFV I AYIMET FGMKYRDAFAYVQERRFC INPNAGF 
VHQLQ E YEAI YLAKLT I QMMS PLQ I ERS LS VHS GTTGSLKRTHE 
EEDD FGTMQ VATAQNG 


6452 


1 i 


652 


RTRGESSNMEPLAAYPLKCSGPRAKVFAVLLSIVLCTVTLFLLQ 
LKFLKP KINS F YAFEVKDAKGRTVS LEK YKGKVS L WNVAS DCQ 
LTDRNYLGLKELHKEFGPSHFSVLAFPCNQFGESEPRPSKEVES 
FARKNYGVTFPIFHKIjKILGSEGEPAFRFLVDSSKKEPRWNFWK 
YLVNP EGQWKFWRPEEP I EVIRPD IAALVRQVI I KKKEDL 


6453 


827 


223 


HRRWL PGLS MS P RRTLPRPLS LCLS LCLCLCLAAALGSAQS GS C 
RDKKNCKVVFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCX5TPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


6454 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKWFSQQE LRKRLTP LQ YHVTQEKGTESAFEGE YTHHK 
D PG I YKC WCGTPL F KS ETKFDSGSG WPS FHD VINS EAI TFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


6455 


1042 


173 


RVHLATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLY IEI KRG VTEDDGRP I YALVWLATTS ISKMATDFAENELDLF 
RKALELI IDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVKICNICHSL 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LIQGQSCETCGIRMHLPCVAKYFQSNAEPRCPHCNDYWPHEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


R PQSRS I SMWRNSLLQVSSGLRWLRVCAMVDI LGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAI PKLVIVKQNGEVTTNKGRKQIRERGLACFQDWVEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYI FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTLI ISVSTG 
DLQQATE FNQW KNW F I LQ FL LS C FLGFLLM YS TVLCS Y YNS AL 
TTAWGA I KNVSVAY IGILIGGDYIFS LLNFVGLN I CMAGGLR Y 
S FLTLS SQLKPKPVGEEN I CLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGYI FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMI IPTLI ISVSTG 
DLQQATEFNQWKNWFILQFLLS CFLGFLLMYS TVLCS YYNSAL 
TTAWGA I KNVSVAY I G I L I GGD Y I FSLLNFVGLNI CMAGGLR Y 
S FLTLS SQLKPKPVGEEN I CLDLKS 


6459 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNI I LSVFAI I LGAFI AAGSDLAFNLEGY I FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMI IPTLI ISVSTG 
DLQQATEFNQWKNWF I LQFLLSCFLGFLLMYSTVLCS YYNSAL 
TTAWGAI KNVSVAY IGI L IGGDY I FS LLNF VGLNI CMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFPDFDKKI PV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGYI FVFLND 
I FTAANG VYTKQ KMD P KELG K YGVL FYNACFM I I P TL I ISVSTG 
DLQQATE FNQWKNWF I LQFLLSCFLGFLLMYS TVLCS YYNSAL 
TTAWGAI KNVSVAY IGI L I GGDY I FS L LNFVG LN I CMAGGL R Y 
SFLTLSSQLKPKPVGEENICLDLKS 


6461 


1653 


360 


LQQRTLRITAVGQTHPIAWMAWEPSLGAFYGPASFITFVNCMYF 
LSIFIQLKRHPERKYELKEPTEEQQRLAANENGEINHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATS LS FSAFFWHHCVNREDVRLAW IMTCCPGRSS 
YS VQ VNVQ PPNSNGTNGE AP KC PNS S AE S S CTNKS AS S FKNS S Q 
GCKLTNLQAAAAQCHANSLPLNSTPQLDNSLTEHSMDNDIKMHV 
APLEVQFRTNVHSSRHHKNRS KGHRASRLTVLREYAYDVPTS VE 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
S TLPKS S RNFEKP VS TTS KKDALRKPAWELENQQKS YGLNLAI 
QNGP I KS NGQEG PLLGTDS TGNVRTGLWKHETTV 


6462 


3 


773 


S EELDREKKLKEDS PRKTPNKESGVPS L P VS LTS I KEEPKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YLHAYP YPQMYDPSHPAYRAVS PVLMHS YPGAYLS PGFHYPVYG 
KMSGREETEKVNTSPSVNTKTTTESKALDLLQQHANQYRSKSPA 
PVEKATAEREREAERERDRHSPFGQRHLHTHHHTHVGMGYPLI P 
GQ YDP FQG LTSAAL VASQQ VAAQAS ASGM FPGQRRE 


6463 


2 


350 


VILCILGGWIFKNADRSMEKKKGEPRTRAEARPWVDEDLKDSSD 
LHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


G I LRQKE RE ERNR I H KKE I LFLEHLL W PS EMSS LSG KVQTVLG 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LVEPSKLGRTLTHEHLAMTFDCCYCPPPPCQEAISKEPIVMKNL 
YWIQKNAYSHKENLQLNQETEAI KEELLYFKANGGGALVENTTT 
G I SRDTQTLKRLAEETGVHI I SGAGFYVDATHSSETRAMS VEQL 
TDVLMNE I LHGADGTS I KCG I IGE IGCSWPLTESERKVLQATAH 
AQAQLGCPVIIHPGRSSRAPFQIIRILQEAGADISKTVMSHLDR 
T I LDKKELLE FAQLGC YLEYDLFGTELLH YQLGPDIDMPDDNKR 
I RRVRLLVE EG CE DR I LVAHD I HTKTRLM KYGGHG YSH I LTNW 
PKMLLRGITENVLDKILIENPKQWLTFK 


6465 


126 


1396 


KMTVFFKTLRJJHWKKTTAGLCLLTWGGHWLYGKHCDyLLRRAAC 
QEAQVFGNQLI P PNAQVKKATVFLNPAACKG KARTL FE KNAAP I 
LHLSGMDVTIVKTDYEGQAKKLLELMENTDVIIVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLAIVKGETVPLDVLQIKGEKEQPVFAMTGLRWGSFRDAGV 
KVS KYWYLE P LK I KAAH F FS TLKE W PQTHQAS I S YTGP TERP PN 
E PEETPVQRPS LYRR I LRRLAS YWAQ PQDAL SQE VS PE VWKDVQ 
LSTIELSITTRNNQLDPTSKEDFLNICIEPDTISKGDFITIGSR 
KVRN P KLHVEG TECLQAS QCTLL I P EGAGGS FS I DSE E YEAMP V 
EVKLLPRKLQFFCDPRKREQMLTS PTQ 


6466 


1134 


828 


VARGTEL S QL E KAH P P ADMGRRKS KR KPP P KKKMTGTLETQ FTC 
P FCNHE KS CD VKMDRARNTGV I S CTVCEjEE FQT P I T YLS E P VD V 
YS D W I DACEAANQ 


6467 


301 


2571 


GELRVLALAHGELACHAVLTASLLSLRSRLMDSDMDYERPNVET 
I KCVWGDNAVGKTRL I CARACNATLTQ YQLLATHVPTVWAI DQ 
YRVCQ E VLE R S RDWD D VS VS LRLWDT FGDHH KDRR FAYGRS DV 
WLCFS IANPNS LHHVKTMWYPE I KHFCPRAP VI LVGCQLDLRY 
ADIiEAVNRARRPIiARP I KPWE I L PP E KGREVAKELG I P YYE TS V 
VAQFG I KDVFDNAI RAAL ISRRHLQFWKSHLRNVQRPLLQAPFL 
PPKPP PP 1 1 WPDPPSSSEECPAHLLEDPLCADVI LVLQERVRI 
FAHK I YLSTS S S KFYDLFLMDLS EG ELGGPS E PGG THP EDHQGH 
SDQHHHHHHHHHGRDFLLRAASFDVCES VDEAGGSG PAGLRAST 
SDGI LRGNGTG YLPGRGRVLS SWSRAFVS IQEEMAEDPLTYKSR 
LMVWKMDS S I QPGPFRAVLKYLYTGELDENERDLMHIAH I AEL 
LE VFDLRMMVAN I LNNE AFMNQE I TKAFHVRRTNR VKE CLAKGT 
FSDVTFI LDDGT I SAHKPLL ISSCDWMAAMFGG PFVESSTRE W 
FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLIILANRLCIjPHL 
VALTEQYTVTGLMEATQMMVDIDGDVLVFLELAQFHCAYQLADW 
CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 
EDHYQRARKEREKEDYLHLKRQPKRRWLFWNSPSSPSSSAASSS 
SPSSSSAW 


6468 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG 
LGRVHHLALKDDVRHKVHLNTFGFFKDGYMVVNVSSLSLNEPED 
KD VT I GF S LDRTKNDGFS S YLDEDVN YC I LKKQS VS VTLL I LD I 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GE IP L PKL Y I SMAF FFFLSGT I W I H I LR KRRND VF KI HWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
I TI AL I GTG WAF I KH ILSDKDKK I FM I VI PRRVLANVA Y HIES 
TE EGTTE YGLWKDS L FLVDLLC CG A I LF P WWS I RHLQEAS ATD 
GKGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG 
LGRVHHLALKDDVRHKVHLNTFGFFKDGYMVVNVSSLSLNEPED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 
GKGKFSRAHFVLLSLL 


6470 


2726 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAPRS 
GPLPREDGCRTPGPQIiLPLPGALLRPRTLLSSAAETGRSRHPDT 
QHPSSGGRCRGGTESPSSAAGRPASMAEAEEDCHSDTVRADDDE 
ENES PAETDIiQAQLQMFRAQWMFELAPG VSS SNLENRPCRAARG 
S LQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAI KFY 
RRAMQLVPDIEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
PYTS WREM FLERPRVR FDG VY I S KTTYIRQGEQ S LDGF YRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


299 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
G P RNKKRGWRRIiAQE PLG LE VDQ FLEDVRLQERTSGG LLS EAPN 
E KLF FVDTGS KEKGLTKKRTKVQKKS LLLKK P LR VDL I LENTS K 
VP AP KDVLAHQ VPNAKKLRRKEQL WE KLAKQGE L PRE VRRAQAR 
LLNPS ATRAKPGPQDT VER P F YD LWAS DNPLDR PLVGQDE FFLE 
QTKKKGVKRPARLHTKPSQAPAVEVAPAGASYNPSFEDHQTLLS 
AAHEVELQRQKEAEKLERQLALPATEQAATQESTFQELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRRSKA 
VHRLR VQQAALRAARLRHQ E LFRLRG I KAQVALRLAELARRQ RR 
RQARREAEADKPRRLGRLKYQAPD I DVQLSS ELTDSLRTLKP EG 
NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 
ARVDLQQQIMTIIDELGKASAKAQNLSAPITSASRMQSNRHWY 
ILKDSSARPAGKGAIIGFIKVGYKKLFVLDDREAHNEVEPLCIL 
DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 
LNKHYNLETTVPQVNNFVIFEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLP PKRAEGD I KPYSS SDREFLKVAVEPP WPLN 
RAPRRATPPAHPPPRSSSLGNSPERGPLRPFVP 


6473 


22 


912 


S S AVE F VWEGE KMAAE PNKTE I QTLF KR LRAVP TNKACFD CGAK 
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELD3NWNWFQ 
LRCMQVGGNANATAFFRQHGCTANDANTKYNSRAAQMYREKIRQ 
LGSAALARHGTDLWIDNMSSAVPNHSPEKKDSDFFTEHTQPPAW 
DAPATEPSGTQQPAPSTESSGLAQPEHGPNTDLLGTSPKASLEL 
KSS 1 1 G KKKPAAAKKGLGAKKGLGAQKVSSQS FS E I ERQAQVAE 
KLREQQAADAKKQAEESMVASMRLA YQELQ I DR 


6474 


3 


462 


LQRQRQHPAAAPAVP VRCFTFC FTD I V I M PKRKS PENTEGKDG S 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKE E KQEAG KEGTAPS ENGE TKAEE IH I SR£ TVNVS TSRG T P 
P S TLS VKGQ I ETVRVKGTEN 


6475 


3 


462 


T«PjRPjR nWP A 21 VP VP P ptt? r* STn tvt wdvdv c t5t?m TDnvn'n e» 

KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1090 


ARAMAQYKGTMREAGRAMHLLKKRERQREQME VLKQR I AEET I L 
KSQVDKRFSAH YDAVEAELKSS TVGLVTLNDMKARQEALVRERE 
RQLAKRQHLEEQRLQQERQRBQEQRRERKRKISCLSFALDDLDD 
QADAAEARRAGNLGKNPDVDTS FLPDRDREEEENRLREELRQEW 
EAQREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK 
SGPLFSFDVHDDVRLLSDATMEKDE3HAGKWLRSWYEKNKHIF 
PAS RWE A YD P EKKWD KYTIR 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6477 


227 


915 


LQGHLMG I MAAS RP LS R FW EWG KN I VCVGRNYADH VREMRS AVL 
SEPVLFLKPSTAYAPEGSPILMPAYTRNLHHELELGWMGKRCR 
AVPE AAAMD YVGGYALCIiDMTARD VQDE CKKKGLPWTLAKS FTA 
S CP VS AF VP KE KI PD PHKL KLWLKVNGE LRQEGETS SM I FS I P Y 

IISYVSKIITLEEGDIILTGTPKGVGPVKENDEIEAGIHGLVSM 
TFKVEKPEY 


6478 


2 


1495 


FVSSRILPESLASSEASTLEAMGRKEEDDCSSWKKQTTNIRKTF 
I FMEVLGSGAFSE VFLVKQRLTGKLFALKCI KKS PAFRDSS LEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RGVYTEKDASLVIQQVLSAVKYLHENGIVHRDLKPENLLYLTPE 
ENSKIMITDFGLS KMEQNG IMSTACGTPGYVAPEVLAQKP YS KA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEGYYBFESPFW 
DDISESAKDFICHLLEKDPNERYTCEKALSHPWIDGNTALHRDI 
YPSVSLQIQKNFAKS KWRQAFNAAAWHHMRKLHMNLHS PGVRP 
E VENRP PE TQ AS ETSRPSSPEITI TE AP VLDHS VALPALTQL P C 
QHGRRPTAPGGRS LNCLVNGS LHISSS LVPMHQGS LAAG PCGCC 
SSCLNIGSKGKSSYCSEPTLLKKANKKQNFKSEVMVPVKASGSS 
HCRAGQTGVCLIM 


6479 


3 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLFPVFDLS 
YF I VS I L YL KYE PGAVELS RRHP IAS WLCAMLHCFG S Y I LADLL 
LGE PL I D YFSNNS S I LLAS AVW YL I F FC PLDLF YKC VC FLP VKL 
I FVAMKE WRVRK IAVG I HHAHHH YHHG WFVM I ATGWVKGS GVA 
LMSNFEQLLRGVWKPETNEILHMSFPTKASLYGAILFTLQQTRW 
LPVS KAS L I F I FTLFMVS CKVFLTATHS HS S P FDALEG Y I C P VL 
FGS ACGGDHHHDNHGGSHSGGG PGAQH S AMPAKS KEE LS EG S RK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHC PD YLRSAKMTEVMMNTQPMEE IGLS PRKDGLS Y 
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVI AGGTLAI P I LAFVASFLLWPSALI RI YYWY 
WRRTLGMQVRYVHHED YQFCYS FRGRPGHKPS I LMLHGFSAHKD 
MWLSWKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIDGQVKRIH 
Q FVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPS DVS SLWLVCP 
AGLQYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMLQLC 
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQ 
NMDK I KVP TQ 1 1 WGKQDQ VLD VS GADMLAKS I ANCQVE LLENCG 
HSWMERPRKTAKLI I DFLAS VHNTDNNKKLD 


6482 


2517 


568 


E P VS KVS QS RRKAGVP TAN I EE S QAVEAAMANVP WAE VCEKFQA 

ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE 

RPEAEDGPGAGDHALGLPAEWEPEGPVAQRAVRLAVIEFHLGV 

NHI DTEELSAGEEHLVKCLRLLRRYRLSHDCI SLC IQAQNNLG I 

LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFLPEE 

E KLTEQERS KRFE KVYTHNLY YLAQ VYQHLEM FEKAAHYCHS TL 

KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 

rvj^A^^xoAiiiui^JtSA&^bVFbJjYHQRKGEIARCWIKY 

NAQLSMQDNIGELDLDKQSELRALRKKELDEEESIRKKAVQFGT 

GELCDAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 

IDGYVTDHIEWQDHSALFKGLAFFETDMERRCKMHKRRIAMLE 

PLTVDLNPQ Y YLL VNRQ IQ FE I AHAYYDMMDLKVA I ADRLRDPD 

SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 

MLAKFRVARLYGKIITADPKKELENLATSLEHYKFIVDYCEKHP 

EAAQEIEVELELSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCGLRARAPLSANGREARAMEQRLAEFRAARKRAGLAAQP 
PAASQGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQ PQGS TS ETPWNTAI PLP S CWDQS FLTNI T FLKVLLW 
LVLLGLFVELEFGLAYFVLSLFYWMYVGTRGPEEKKEGEKSAYS 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








VFNPGCEAIQGTLTAEQLERELQLRPLAGR 


6484 


201 


965 


QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGSSIGNCPFCQRLF 
M I LWLKGVKFNVTT VDMTRKPEE LKDLAPGTNPPFLVYNKE L KT 
up J.Klc.Bcijr«y iijAPPRiPHLbPKYKESFDVGCNLFAKFSAYIK 
NTQKEANKNFEKSLLKEFKRLDDYLNTPLLDEIDPDSAEEPPVS 
RRLFLDGDQLTLADCS LL PKLNI I KVAAKKYRDFD I PAE FSGVW 
RYLHNAYAREEFTHTCPEDKEIENTYANVAKQKS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRSILEEDEE 
Ur.r.i'PK Vliij X HiifKo r b VtaMLi VWtiKHKKYPFW PAWKS VRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSL I TDYRVRLGCGS FAGS FLE YYAADI S YPVRKS I Q 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMLPDRSRAARD 
RANQKLVE Y I GKAKGAE SHLRAI LKS RKP S RWLQTFLS S S Q YVT 
CVETYIiEDEGQLDL WKYLQGVYQEVGAKVLQRTNGDR I RFILD 
VLLPEAIICAISAGDEVDYKTAEEKYIKGPSLSYREKEIFDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLS P S RVTQG I Y YM LAFS EM P KP P D YS ELSD S LTLA 
GGTGRFS G PLHRAWRMMNFRQRMG W I GVG L Y L LAS AAAF Y YVFE 
ISETYNRLALraiCX3HPEEPLEGTTWTHSLKAQLLSLPFWVWTV 
I FLVP YLQMFLFLYSCTRADPKTVGYCI IPI CLAVICNRHQAFV 
KASNQ I SRLQL IDT 


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCASN 
YYNGCIFHRNIKGFMVQTGDPTGTGRGGNSIWGKKFEDEYSEYL 
KHNVRGWSMANNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID 
GLETLDELEKLPVNEKT YRPLNDVHI KDITIHANPFAQ 


6488 


878 


241 


TALQEFGTSGPPLSLRFALPSGTGRFKPLPGARGPSWPPSPRVP 
ME P PNL YP VKL YVYDLS KG LARRLS P I MLGKQLEG I WHTS I WH 
KDEFFFGSGG I S S CPPGGTLLGPPDS WDVGS TEVTEEI FLE YL 
SSLGESLFRGE AYNLFEHNCNTFS NEVAQFLTGRKI PS Y ITDLP 
SEVLSTPFGQALRPLLDSIQIQPPGGSSVGRPNGQS 


6489 


1457 


375 


KVAKMATALSEEELDNEDYYSLLNVRREASSEELKAAYRRLCML 
YHPDKHRDPELKSQAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GL EMEGWE WE RRRTPAE I RE E FERLQRERE E RRLQQRTNP KGT 
ISVGVDATDLFDRYDEEYEDVSGSSFPQIEINKMHISQSIEAPL 
TATDTAI LSGSLSTQNGNGGGS INFALRRVTSAKGWGELEFGAG 
DLQGP LFGLKL FRNLTP RCF VTTNCALQFS SRG I RPGLTTVLAR 
NLDKNTVGYLQWHCS S PLLQVQRPHRWTRACAPE PS FRPFLHVP 
TWDAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCE VWt/5 YGP RAAAAAAATVLFGGAGPTE TM F VARS IAADH 
KDL I HDVS FDFHGRRMATCS SDQS VKVWDKSE SGD WHCTAS WKT 
HSGS VWRVTWAHPE FGQVLASCSFDRTAAVWEE I VGE SNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE 

DSS PNAMAKVQ I FE YNENTRKYAKAETLMTVTDPVHD IAFAPNL 
GRSFHILAI ATKDVRIFTLKPVRKELTS SGGPTKFE IHIVAQFD 
NHNS QVWRVS WN I TGTVLAS SGDDGCVRLWKANYMDNWKCTG I L 
KGNGS PVNGSSQQGTSNPSLGSNI PS LQNSLNGS SAGRKHS 


6491 


3 


1183 


HEAGCE VWLGYG PRAAAAAAATVLFGGAGPTETMFVARS IAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG 
QSHWVKRTTLVDSRTS VTDVKFAP KHMGLMLATCSADGI VR I YE 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQI FEYNENTRKYAKAETLMT VTDPVHD IAFAPNL 
GRSFH I LAI ATKDVR I FTLKP VRKELTSSGGPTKFE I HIVAQFD 
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








KGNGS P VNGS S QQGTSN P S LGSN I P S LQNS LNGS SAGRKHS 


6492 


34 


2573 


I P FLKS CCCC CL FDF P P P PLDQ VQE E E CEVE R VTEHGTPKP FRK 
FDS VAFGESQ S EDEQFENDLETDPPNWQQLVS REVLLGLKPCEI 
KRQE VI NE LF YTERAHVRTL KVLDQ VF YQRVS REG I LS PS ELRK 
IFSNLEDILQLHIGLNEQMPCAVRKRNETSVIDQIGEDLLTWFSG 
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 
PLCRRLQLKDIIPTQMQRLTKYPLLLDNIATYTEWPTEREKVKK 
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN 
VE E LRNbDLTKR KMI HEGPLVWKVNRDKT IDL YTLLLE D I LVLL 
QKQDDRL VLRCHS KI LASTADS KHT FS P VI KL S T VLVRQ VATDN 
KAL FV I S MS DNGAQ I YE L VAQTVS E KTVWQDL I CRMAAS VKEQS 
TKP I PLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 
LE S TL I S S KP QS HS LSTS GKS E VRDLFVAERQ F AKEQHTDGT LK 
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLLVQQLGLT 
EKSVQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP 
FRTGTGD IATCYS PRTSTES FAPRDS VGLAPQDSQASNILVMDH 
MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 
EE KD VNLRI SGNYLI LDGYDP VQES S TDEE VASS LTLQPMTG I P 
AVE S THQQQHS PQNTHSDGAI S PFTPEFLVQQR WGAME YS CFE I 

QSPSSCADSQSQIMEYIHKIEADLEHLKKVEESYTILCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTLDSASAHFAASAWSAPVPSRSEVA^ 
KEQNTGHNNING WQPSGTS KTLYSTNMALSSS PG IS AVQLVRT 
VGHTTTNHLIPALCTSSPQTLPMNNSCLTNAVHLNNVSVVSPVN 
VH INTRTS APS P TALKLATVAAS MDR VPK VTPS S AI S S IARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSS PHRRCRRHRPR PLPRPPAAI MSAS AVY VLD 
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG 
VRFMW I KHNNLYLVATS KKNAC VSLVFS FLYKWQ VFSEYFKEL 
SEES I RDNF V 1 1 YELLDELMD FG YPQTTDS KI LQE YI TQEGHKL 
ETGAPR P PATVTNAVS WRS EG I KYRKNEVFLDVI ESVNLLVSAN 
GNVLRS E I VGS I KMRVFLSGMPELRLGLNDKVLFDNTGRGKS KS 
VELEDVKFHQCVRLSRFENDRT IS F I P PDGEFELMS YRLNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADS P K F KTT VGS VKWVPENS E I VWS I KS FPGGKE YLMRAHFGL 
PS VEAEDKE GKPP I S VKFE I PYFTTSG I QVRYLK I I EKSGYQAL 
P WVRY I TQNGD YQLRTQ 


6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVL I CRN YRGDVDMSEVEHFMPILMEKEEEGMLS P I LAHGG 
VRFMW I KHNNLYLVATS KKNACV S L VFS FLYKWQ VFSEYFKEL 
EEES IRDNFVI I YELLDELMDFG YPQTTDS K I LQE Y I TQEGHKL 
ETGAP R P PATVTNAVS WRS EG I KYRKNEVFLDVI ESVNLLVSAN 
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS 
VELEDVKFHQCVRLS R FENDRT I S F I P PDGE FELMS YRLNTH VK 
PLIWIESVI EKHSHSR I E YM I KAKS Q FKRRS TANNVE IH I P VPN 
DADS PKFKTTVGS VKWVPENS E I VWS IKS FPGGKEYLMRAHFGL 
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL 
P WVRY I TQNGD YQLRTQ 


6496 


247 


559 


LRAVSLLPLQLVLPBYSIHSLFCIMFLCAQEWLTLGLNVPLLFY 
HFWRYFHCPADSSELAYDPPVVMNADTLSYCQKBAWCKLAFYLL 
SFFYYLYCMIYTLVSS 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGESGV 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
TAG LER YRAI TS AYYRGAVGALLVFDLTKHQT YAWERWLKE LY 
DHAE AT I WMLVGNKS DLS Q ARE VPTE EARM FAENNGLLFLE TS 
ALD S TNVELAFETVLKE I FAKVS KQRQNS IRTNAI TLGSAQAGQ 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








EPGPGEKRACCISL 


6498 


2636 


272 


SLRLCP WGTHLAGPTTMRLSSLLALLRPALPL I LGLS LGCSLSL 
LRVSWIQGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VPYYRDPNKPYKKVLRTRYIQTELGSRERLLVAVLTSRATLSTL 
AVAVNRT VAHHF PRLLYFTGQRGARAPAGMQVVS HGDER P AWLM 
S ETLRHLHTH FG AD YDWFF IMQDDT YVQAPRLAALAGHLS I NQD 
Jb Y iAjKAbb F I G AGEQAR YCHGGFG YLLSRS LLLRLRPHLDGCRG 
D I LS ARPDE WLG RCL I DS LGVGC VS QHQGQQ YRS FELAKNRDPE 
KEGS SAFLS AFAVHPVS EGTLMYRLHKRFSALELERAYS E I EQL 
QAQIRNLTVLTPEGEAGLSWPVGLPAPFTPHSRFEVLGWDYFTE 
QHTFSCADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LRPLS RVJS I LPMP YVTEATRVQLVLPLLVAEAAAAPAFLEAFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRLAWLAVRAEAPSQVRLMDWSKKHPVDTLFFLTTVWTRPG 
PEVLNRCRMNAI SGWQAFFP VHFQEFNPALS PQRSPPGPPGAGP 
DPPSPPGADPSRGAPIGGRFDRQASAEGCFYNADYLAARARIiAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CSPRLS EELYHRCRLSNLEGLGGRAQLAMALFEQEQANST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP 
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHLKLAGM 
ADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQLNGTYAIAG 
GKAHCG P AEL CE F YS RDPDGL PCNLRKP CNR P SG LEPQ PG VFDC 
LRDAMVRDYVRQTWKLEGEALEQAI ISQAPQVEKLIATTAHERM 
PWYHS SI/EREEAERKLYSGAQTDG KFLLRPRKEQGTYALS LIYG 
KTVYHYLISQDKAGKYC I PEGTKFDTLWQLVE YLKLKADGL I YC 
LKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
L IAD I E LGCGNFGS VRQGVYRMRKKQ I DVAI KVLKQGTEKADT E 
EMMREAQIP4HQLDNPYIVRLIGVCQAEALMLVMEMAGGGPLHKF 

VNRHYAK I SDFGLS KALGADDS YYTARSAGKV^ LKWYAPEC INF 
RKFS S RS DVWSYGVTM WEALS YGQKP YKKMKG PE VMAFI EQGKR 
MECP P E CP P EL YALMS DCW I YKWEDRPD FLT VEQRMRAC Y YS LA 
SKVEGPPGSTQKAEAACA 


6500 


1773 


726 


MLSESS S FLKGVMLGS IFCAL I TMLGH IR IGHGNRMHHHEHHHL 
QAPNKED I LKI SEDERMELSKS FRVYCI I LVKPKDVSLWAAVKE 
TWTKHCDKAEFFSS ENVKVFES INMDTNDMWLMMRKAYKYAFDK 
YRDQYNWFFLARPTTFAIIENLKYFLLKKDPSQPFYLGHTIKSG 
DLEYVGMEGGIVLSVESMKRLNSLLNIPEKCPEQGGMIWKISED 
KQLAVCLKYAGVFAENAEDADGKDVFNTKS VGLS I KEAMT YHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501 


1 


570 


LVGMS GGGT ET P VGCEAAPGGG S KKRDSLGTAGS AHL I IKDLGE 
IHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE 
TNEHTL P KCRDTMRDS LS QVLQRLQAANDS VCRLQQREQERKKI 
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ 
YAEME KDLAKFSTF 


6502 


213 


1650 


AGNKPDP WAGRNRTAVLPDVS VFHRED VGWWRSWLQQS YQAVKE 
KSS EALE FMKRDLTE FTQ WQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AE P YDGT KARL YS LQS DPAT YCNE PDG P P E L FDAWLS QFCLEE K 
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ 
EQARRDAL KQRAEQS I SEEPG WEEEEEELMG I S PI S PKEAKVP V 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 


SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG 

EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KS S SAL E FMKRDLTE FTQ WQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AE P YDGT KARLYS LQSD PAT YCNE PDG P PEL FDAWLS Q FCLEE K 
KGE I SELLVGS PS IRAL YTKMVPAAVSHSEFWHRYF YKVHQLEQ 

EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV 

AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGS STD I S ED WEKDFDLDMTEEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCLVAHWVCLSILS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVAS SH 1 S DANLANT 1 I GKAVEHMFEGEHGS KDE WRGMVLAQA 
P I MKAWF Y I T YE KD P VL YM YQLLDD YKEGDLR IMPESSESP PTE 
REPGGWEGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDDFH I YVYDLVKKS 


6505 


2131 


1294 


GKVCLVAHWVCLSILS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPS SQPCRNI VGCR I SHGWKEGDE P I TQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVAS SHISDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 

PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 

DP 1 DnflT ATTVT TPVinTTPVTVPriPevo tput rxurw rc 7\ trnmivrTvn 
icCiirvjij v viAsi-i -LIjAjivc. i J. i\Ji<JJijoJ\Klt3MvXfiy VKAKPS VY FIKF 

DDD FH I YVYDLVKKS 


6506 


1 


1350 


EVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST 
ELVBDSHYSQSQLVCSDCGCWTEGVLTTTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAYYQQAY 
RHSG I RAARLQKKE VL VGC C VL I TCRQHNWPLTMGA I CTLL YAD 
LDVFS ST YMQ I VKLLGLD VPSLCLiAELVKTYCS S FKLFQAS PSV 
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVITAATFLA 

EQLAWLRVLRLDKRSWKHIGDLLQHRQSLVRSAFRDGTAEVET 
REKEPPGWGQGQGEGSVGNNSLGLPQGKRPASPALLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507 


1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVT I P I WQNKPHGA 
ARSVVRRIGTNLPLKPCARASFETLPNISDLCLRDVPPVPTLAD 
I AW I AADEE E TYAR VRS DTR PLRHTWKP S PL I VMQRNAS VPNLR 
GS EERLLALKKPAL P ALSRTTELQDELS HLRS Q I AKI VAADAAS 
AS LTPD FLS PGSSNVS SPLPCFGSS FHS TTS F VI S D I TE ETE VE 
VPELPSVPLLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMGILKDFHRMKQSQDLNRS LLKEEDPAVL I SEVLRRKFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTFRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGl^VHVVVDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
ILQLVGDAVHPQFKE I QKLI KE PAPDSGLLGLFQGQNSLLH 


6509 


2 


1053 


fvwnprggrkrrrqaavtqaatrasgtpsprdgtmtOgklsvan 

KAPGTEGQQQVHGEKKEAPAVPSAPPSYEEATSGEGMKAGAFPP 

aptavplhpswayvdpsssssydngfptgdhelfttfswddqkv 
rrvfvrkvytilliqllvtlavvalftfcdpvkdyvqanpgwyw 
asyavffatyltlaccsgprrhfpwnlilltvftlsmayltgml 


SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








S S Y YNTTS VLLCLG I TALVCLS VTV FS FQT KFDFTS CQG VL F VL 
LMTLFFSGL I LAILLPFQYVPWLHAVYAALGAGVFTLFLALDTQ 
LLMGNRRHSLSPEEYIFGALNIYLDIIYIFTFFLQLFGTNRE 


6510 


37 


1156 


PCALDGCPQRGAVHPLLSSAMGLIAFLKTQFVLHLLVGFVFWS 
GLVINFVQLCTLALWPVSKQLYRRLNCRLAYSLWSQLVMLLEWW 
SCTECTL FTDQATVE RFGKEHAVI I LNHNFE I DFL CGWTMCERF 
GVLGSSKVLAKKELLYVPLIGWTWYFLEIVFCKRKWEEDRDTW 
EGLRRLSDYPEYMWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL 
KYHLLPRT KG FTTAVKCLRGTVAA VYDVTLN FRGNKNP S LLGIL 
YGKKYEADMCVRRFPLEDIPLDEKEAAQWLHKLYQEKDALQEIY 
NQKGMFPGEQFKPARRPWTLLNFLSWATILLSPLFSFVLGVFAS 
GSPLLILTFLGFVGAGNGHCR 


6511 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPLKGGK 
TMATNFSDIVKQGYVKMKSRKLGIYRRCWLVFRKSSSKGPQRLE 
KYPDEKSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAIIFTDD 
SARTFTCDS ELEAEEW YKTLS VECLGSRLNDI SLGEPDLLAPG V 
QCEQTDRFNVFLLPCPNLDVYGECKLQITHENIYLWDIHNPRVK 
LVS WPLCSLRR YGRDATR FTFEAGRMCDAGEGL YTFQTQEGEQ I 
YQRVHSATLA I AEQHKR VLLEME KNVRLLNKGTEHYS YP CT PTT 
MLPRSAYWHH1TGSQNIAEASSYAGEGYGAAQASSETDLLNRFI 
LLKPKPSQGDSSEAKTPSQ 


6512 


159 


807 


FGKKSTWFPLSRS LR VAS GRSC KLGHGG YTG SG PGFGE PRDS GA 

EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKEPRMVCTRKTK 

TLVSTCVILSGMTNIICLLYVGWVTNYIASVYVRGQEPAPDKKL 

EEDKGDTLKIIERLDHLENVIKQHIQEAPAKPEEAEAEPFTDSS 

LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6513 


2 


756 


FVS PE PGFS LAQLNL I WQLTDTKQLVHSFAEGQDQGSAYANRrA 
LFPDLIiAQGNASLRLQRVRVADEGS FTCFVS I RDFGSAAVS LQ V 
AAPYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQ 
GVPLTGNVTTSQMANEQGLFDVHS I LRWLGANGT YSCLVRNPV 
LQQDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSF 
SPE PGFS LAQLNL I WQLTDTKQLVHS FAEGQDQGSAYANRTALF 
PDLLAQGNAS LRLQRVR VADEGS FTCFVS IRDFGSAAVSLQVAA 
PYS KP S MTLE PNKDLRPGDT VTITC S S YQG Y PEAE VFWQDGQG V 
PLTGNVTTSQMANEQGLFDVHS ILRWLGANGTYS CLVRNPVLQ 
QDAHS S VT I TPQRS PTGAVEVQ VPEDP WALVGTDATLRCS FS P 
E PGFS LAQLNLI WQ LTDTRQLVHS FTEGR 


6514 


985 


302 


VGI PG PT I S S AAEMEDLLDLDE ELR YSLATSRAKMGRRAQQESA 
QAENHLNG KNSS LTLTGETS S AKL PRCRQGGWAGDS VKAS KFRR 
KASEE IEDFRLRPQSLNGSDYGGDI PI I PDLEE VQE EDFVLQVA 
APPS I Q I KRVMT YRDLDNDLMKYSAIQTLDGE I DLKLLTKVLAP 
EHEVRERNPSWQDDVGWDWDHLFTEVSSEVLTEWDPLQTEKED? 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGSTQLEVSASASCGALGSADMNPIW 
VHGGGAG P I S KDRKER VHQGMVRAAT VG YGI LREGG S AVDAVEG 
A WALEDD PE FNAGCGS VLNTNGEVEMDAS I MDGKDLS AGAVS A 
VQCIANP I KLARLVMEKTPHCFLTDQGAAQFAAAMGVPE I PGE K 
LVTERNKKRLEKEKHEKGAQKTDCQKNLGTVGAVALDCKGNVAY 
ATS TGG I VNKMVGR VGDS P CLGAGG YADND IGAVS TTGHGES I L 
KVNLARLTLFHIEQGKTVBEAADLSLGYMKSRVKGLGGLIVVSK 
TGDWYAKWTSTSMPWAAAKDGKLHFG I DPDDTTITDL P 


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSALQQL 
KGRRSEHRNENQEMPYSTNKELILGIMVGTAGISLLLLWYHKVR 
KPG I AMKL P E FLS LGNTFNS I TLQDE IHDDQGTTV I FQERQLQI 
LE KLNELLTNMEBLKE E I RFLKEAI P KLEE Y I QDELGGK I T VHK 
I S PQHRARKRRLPTI Q S S ATSNS S EEAESEGG YITANTDTEEQS 
FP VP KAFNTRVE E LNLD VLLQKVDHLRM SE SG KS E S FELLRDH K 
EKFRDEIEFMWRFARAYGDMYELSTNTQEKKHYANIGKTLSERA 
I NRAPMNGHCHLW YAVLCG YVS EFBGLQNKINYGH L FKEHLD I A 
IPCLLPEEPFLYYLKGRYCYTVSKLSWIEKKMAATLFGKIPSSTV 
QEALHNFLKAEBLCPGYSNPNYMYLAKCYTDLEENQNALKFCNL 
ALLLPTVTKEDKEAQKEMQKIMTSLKR 


6517 


3 


1414 


GRVWGGS S S LNAMVYVRGHAED YERWQRQGARGWDYAHCLP YFR 
KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP 
LTEDMNGFQQEGFGWMDMTIHEGKRWSAACAYLHPALSRTNLKA 
EAE XL VS RVL FEGTRAVGVE YVKNGQ S HRA YAS KE VI LSGGAIN 
S PQL LM LS G I GNADDLKKLG I P WCHL PG VGQNLQDHLE I Y I QQ 
ACTRPI TLHS AQKPLRKVC IGLE WLWKFTGEGATAHLETGGF IR 
SQPGVPHPDIQFHFLPSQVIDHGRVPTQQEAYQVHVGPMRGTSV 
GWLKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKLTREIFAQE 
ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
P SD PTAWDP QTRVLGVENLR WDAS I M PS MVSGNLNAP TI MIA 
EKAADI I KGQPALWDKD VPVYKPRTLATQR 


6518 


242 


1098 


PAWNPGSEPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAP P PPSTMGDAGS ERS KAPS LPPR CPOGFWGS S KTMN 

LCSKCFADFQKKQPDDDSAPSTSNSQSDLFSEBTTSDNNNTSIT 
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVST,TTPTTfPc:r 
GTDS QS ENEAS P VKRPRLLENTERS EETS RS KQKS RRRCFQCQT 
KLELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KMVKLDRKVGRS CQR I GEG CS 


6519 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEXKAPRRVNGEGGSGGNS RQLQPPAAPS PQS YGS PAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHIIIiLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKERBKKKHK 
VMNE I KKENGE VKI LL KSG KEKP KTN I EDLQ I KKVKKKKKKKHK 
ENEKRKRPKMYS KS IQT I CSGLLTDVEDQAAKG I LNDNI KD YVG 
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQ PNGGAS VrHCLQ 


6520 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLIiHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKI KDKI KERDKEKEREKKKHK 
VMNE I KKENGEVKILLKSGKEKPKTNI EDLQ I KKVKKKKKKKHK 
ENEKRKRPKMYS KSIQTI CSGLLTDVEDQAAKG I LNDN I KD YVG 
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
H I EHQ PNGGAS VI HCLQ 


6521


184


1798


KLFKMATDTSQGELVHPKALPLIVGAQLIHADKLGEKVEDSTMP
IRRTVNSTRETPPKSKLAEGEEEKPEPDISSEESVSTVEEQENE
TPPATSSEAEQPKGEPENBEKEENKSSEETKKDEKDQSKEKEKK
VKKTIPSWATLSASQLARAQKQTPMASSPRPKMDAILTEAIKAC
FQKSGASWAIRKYIIHKYPSLELERRGYLLKQALKRELNRGVI
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE
DVLPLAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLLKNA
LQRAVERGQLEQITGKGASGTFQLKKSGEKPLLGGSLMEYAILS
AIAAMNEPKTCSTTALKKYVLENHPGTNSNYQMHLLKKTLQKCE
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE
DEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA
PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS
KKPATSARKE


6522


1042


391


NKWLRPSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED
ECLDYYGMLSLHRMFEVVGGQLTECELELLAFLLDEAPGAAGGL
SRARSGLKLLLELERRGQCDESNLRLLGQLLRVLARHDLLPHLA 
RKRRRPVSPERYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1097 


AS CQTRRR TAALDS GERI AGRRS P IALAMASNFND I VKQG YVKI 
RSR KLG I FRRCWL VFKKAS S KG PRRLE KFPDEKAAYFRNFHKVT 
ELHNIKNI TRLPRETKKHAVAI I FHDETSKT FACES ELEAEEWC 
KHLCME CLGTRLND I S LGE P DLLAAGVQREQNER FNVYLMPTPN 
LD I YGE CTMQ I THEN I YLWD IHNAKVKL VM WPLS S LRR YGRDS T 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTSLTEPMTLSKS I SLPRSAYWHH I TRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524


2 


1097 


AS CQTRRRTAALDSGERI AGRRS P I ALAMAS NFND I VKQG YVK I 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAI I FHDETSKTFACESELEAEEWC 
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LDI YGECTMQI THEN I YLWD IHNAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTS LTE PMTLS KS I SLPRSAYWHH I TRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6525 


1 


1859 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 
PVKTP S DAGNS P IGFCPGSDEGFTRKKCT I GMVGEGS IQS SRYK 
KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 
NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQ YLT PLQQKE VT VRHLKTKLKES ERRLHERE S E I VE LKS QLAR 
MREDWIEEECHRVEAQLALKEARKE I KQLKQVI ETMRS S LADKD 
KG I QKYFVD IN I QNKKLES LLQS MEMAHSGS LRDELCLDFPCDS 
PE KS LTLNPP LDTMADGLS LEEQ VTGEGADR ELL VGDS I ANSTD 
LFDE I VTATTTESGDLELVHSTPGANVLELLP I VMGQEEGS VW 
ERAVQTDWPYS PAIS ELI QSVLQKLQDPCPSS LAS PDESEPDS 
MES FPESLSALWDLTPRNPNSAILLS PVETPYANVDAEVHANR 
LMRELD FAACVE ERLDGVI PLARGGWRQYWSS S FLVDLLAVAA 
PWPTVLWAFSTQRGGTDPVYNIGALLRGCCWALHSLRRTAFR 
IKT 


6526 


2 


2034 


SGRAGEPEEWRGRQ 1 1 DS KETWI P FNSEDSQQLEEAYSSGKG CN 
GRVVPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVPYSESFSQVljEETYMLAVTLDEWKKKLESPNREIIILHNP 
KLMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVEN I S VDI HCGE P 
LQI DHLVFWHG IGPACDLRFRS I VQCVNDFRS VSLNLLQTHFK 
KAQENQQIGRVEFLPVNWHS PLHSTGVDVDLQRITLPS INRLRH 
FTNDT I LDVFF YNS PTYCQTI VDTVAS EMNR I YTLFLQRNPDFK 
GGVS I AGHSLGSLI LFDILTNQKDSLGDIDSEKGSLNIVMDQGD 
TPTLEE DLKKLQLS EFFD I FE KEKVDKEALALCTDRDLQE IG I P 
LGPRKKI LNYFS TRKNSMG I KRPAPQPASGANI PKESEFCSSSN 
TRNGD YLDVGIGQVS VKYPRL I YKPE I FFAFGS P IGMFLTVRGL 
KRIDPNYRFPTCKG FFNI YHPFDPVAYR IEPMWPGVEFE PML I 
PHHKGR KRMHLE LREGLTRMS MDLKNNLLGSLRMAW KS FTRAP Y 
PALQASETPEETEAEPESTSEKPSDVNTEETSVAVKEEVLPINV 
GMLNGGQRIDYVLQEKPIESFNEYLFALQSHLCYWESEDTVLLV 
LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


G WVPLLS R I LPS DACKI YKQG I NI RLDTT LI DFTDM KCQRGDLS 
F I FNG DAAPSES FWLDNE QKVYQRI HHE ESEMETEEE VD I LMS 
SDIYSATLSTKS IS FTRAQTGWLFREDKTERVGNFLADFYLVNG 
L VLE S RKRREHLS EED I LRNKAI ME S LS KGGN IMEQN FE P I RRQ 
S LTP PPQNT I TWE E Y I S AENGKAPHLGRELVCKE S KKT FKATI A 
MSQEFPLGIELLLNVLEWAPFKHFNKIjREFVQMKLPPGFPVKL 
D I P VF P T I TAT VTFQE FRYDE FDGS I FT I PDD YKEDPSR F P DL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRAIX3RRKGWLRLRKILFCVLGLYIA 
I PPL I KLCPG IQAKLI FLNFVRVP YFIDLKKPQDQGLNHTCNYY 
I^PEEDVTIGWHTVPAVWWKNAQGBCDQMWYEDALASSHPIILY 
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE 
RGMTYDALHVFDW I KARSGDNP VYI WGHS LGTGVATNLVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSGI KFANDENVKHISCPLLILHAEDDPWPFQLGRKLYS IAA 
PARS FRDFKVQFVP FHS DLG YRHKY I YKS PEL PR I LRE FLGKS E 
PEHQH 


6529 


363 


2215 


TH I R YNKIG WKTMS CGNE FVETL KK I G Y PKADNLNGEDFD WLF 
EGVEDES FLKWFCGNVNEQNVLSERELEAFS I LQKSGKP I LEGA 
/\iiUiSAijJ\.i\.KT. bULiivl FKJ^DKEIjEKIjEDEVQTIjIjKLI^LKIQR 
RNKCQLMASVTS HKS LRLNAKEEEAT KKLKQS QG I LNAM I TKI S 
NELQALTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSUEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPSICD 
NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 
ESLHSLTSKAVDKENLDAKI SSLTSE I MKLEKEVTQ I KDRSLPA 
WRENAQLLNMP WKGDFDLQ I AKQD YYTARQELVLNQLI KQKA 
SFELLQLSYEIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 
TDPS VSQQINPRNTI DTKDYS THRLYQ VLEGENKKKELFLTHGN 
LEE VAE KLKQNI SLVQDQLAVS AQEHS FFLS KRNKD VDM LCDTL 
YQGGNQLLLSDQELTEQFHKVESQLNKLNHLLTDILADVKTKRK 
TLANNKLHQMEREFYVYFLKDEDYLKD I VENLETQS KI KAVS LE 
D 


6530 


128 


2986 


GAAHHGAI VQVHPLL PGS S T I M I HDLCL VF P AP AKAWYVS D I Q 
ELYIRVVDKVEIGKTVKAYVRVLDLHKKPFLAKYFPFMDLKLRA 
AS P 1 1 TL VALDE ALDNYT I TFL I RGVA I GQTSLTAS VTNKAGQR 
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL 
FSISNESVALVSAAGLVQGLAIGNGTVSGLVQAVDAETGKWI I 
SQDLVQVEVLLLRAVRIRAPIMRMRTGTQMPIYVTGITNHQNPF 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHE AS IRL P S Q YNFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNPE I EAEQILMS PNS YI KLQTNRDGAASLS YRVLDGPEKVPV 

\7H\7nT?Tf/^lT7T.2VCrtQMTn'T 1 Cn , TT?tTTlVOPDT7r , aKr/ , vrT T\T^\TV\7C n\ro 
v n v jjej rvur unouoni vj J. o l ±Ei v L/\\^ctc r unNul J. X VAvlxVorvb 

YLRVSMS PVLHTQNKEALVAVPLGMTVTFTVHFHDNSGDVFHAH 
S S VLNFATNRDD FVQ IGKGPTNNTCWRTVS VGLTLLR VWDAKH 
PGLSDFMPLPVLQAI S PELSGAMWGDVLCLATVLTSLEGLSGT 
WSSSANSILHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEVW 

S VPOR I MAR HT iH P I OT ^ FOR AT AS K\I T VA VfinP QQWT.P f2T7 PTDT 
u v tr yt\ -L rjj-v.rv.ri untr iu x oryaniAuav J. VAVuLHUOiiiiKuEil., Jtri 

QREVIQALHPETLISCQSQFKPAVFDFPSQDVFTVEPQFDTALG 
QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA 
EVPFSPGLFADQAEILLSNHYTSSEIRVFGAPEVLENLEVKSGS 
PAVLAFAKEKSFGWPSFITYTVGVLDPAAGSQGPLSTTLTFSSP 
VTNQAIAI PVTVAFWDRRGPGPYGASLFQHFLDS YQVMFFTLF 
ALLAGTAVMI IAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
SPTSPNALPPARKASPPSGLWSPAYASH 


6531 


845 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS
S LCMV I T I YYDVKVR EI VRG CGQ Y I S YRCQE KRNT YFAE YWYQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLSLPLPNFHAGTEPDGLDPMVTLSLNLGIiSFAELRRMYLFLNS 
S GLLVL PQAGLLTPHP S 


6532 


2 


954 


AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLIKGVAPPTL 
ITDSTGTHLVLTVTNKNAHSPGLSRGSPQQPSSQPGSPAPAPSA 
QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDLFDILIQSGEISADFKEPPSLPGKEKPSPKTVCWSPLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
EPLSLIDDLKSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 
DLADGH LD SMD W LE LS SGG P VLSLAP LS TTAPS L FS TD FLDGHD 
LQLHWDSCL 


6533 


1798 


373 


STISWLARVBPPRRSSGVGAARLRFPGGSRPLRARACVLAIAVL 
ALLERNNADS MS AHS MLCER I AI AKEL I KRAES LS RS RKGG I EG 
GAKLCSKLKAELKFLQKVEAGKVAIKESHLQSTNLTHLRAIVES 
AENLEEWSVLHVFGYTDTLGEKQTLWDWANGGHTWVKAIGR 
KAEALHNI WLGRGQ YGDKS 1 1 EQAEDFLQASHQQPVQYSNPH 1 1 
FAFYNS VSS PMAE KLKEMG I S VRGDI VAVNALLDHPEELQPSES 
ESDDEGPELLQVTRVDRENI LASVAFPTE I KVDVCKR VNLD I TT 
L I TYVS ALS YGGCH F I FKE KVLTEQAE QERKEQ VLPQLE AFMKD 
KELFACESAVKDFQSILDTLGGPGERERATVLIKRINWPDQPS 
ERALRLVASSKINSRSLTIFGTGDTLKAITMTANSGFVRAANNQ 
GVKFS VF I HQPRALTESKEAliATPLPKDYTTDSEH 


6534 


47 


596 


KATRF I SAAFWLNKQGVS PAKLPHTS WS WSLQTLS FLFSGDLA 
EKSLQCFPCSAMLLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYSYGATPYLFWMERTVEEAFILLVCLKPWRVASS 
LE KKE KE DES FQLIiLGSRYNVLKAHC LLP I»I RWLTS GD S LLS AQ 
PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNLETLLTLAFLEIDKAFSSHARLS
ADATLLTS GTTATVALLRDG I EL WASVGDSRAI LCRKGKPMKL 
TIDHTPERKDEKERI KKCGGFVAWNSLGQPHVNGRLAMTRS IGD 
LDLKTSG VIAE PETKRIKLHHADDS FLVLTTDG INFMVNSQE I W 
DFVNQCHD PNEAAHAVTEQAI QYGTEDNSTAVWPFGAWGKYKN 
SEINFSFSRSFASSGRWA 


6536 


242 


1174 


SLVKEMTNQYGILFKQEQAHDDAIWSVAWGTNKKENSETWTGS
LDDLVKVWKWRDERLDLQWSLEGHQLGWSVDISHTLPIAASSS 
LDAHI RL W DLENGKQ I KS I DAGP VDAWTLAF S PD S Q YLATGTHV 
GKVNI FGVE S GKKE YS LDTRG KF I LS I AYS PDGKYLAS GA I DG I 
INI FD I ATGKLLHTLEGHAMP IRS LTFS PDSQLLVTASDDGYI K 
IYDVQHANLAGTLSGHASWVLNVAFCPDDTHFVSSSSDKSVKVW 
DVGTRTCVHT FFDHQDQVWGVKYNGNGS KI VS VGDDQE I HI YDC 
PI 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLLDRV 
FTTYKLMHTHQTVDFWSKHAQFGGFSYKKMTVMEAVDLLDGLV 
DESDPDVDFPNS FHAFQTAEG I RKAHPD KDW FHLVGLLHDLGKV 
LALFGEPQWAWGDTFPVGCRPQASWFCDSTFQDNPDLQDPRY 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTLSPQSTCTR 


6538 


3345 


2412 


ARHNRDDEAI KKAVNE YDETMBKYI PVLMAQAKI YWNLENYPMV 
EKIFRKSVEFCbnDHDWKLNVAHVLFMQENKYKEAIGFYEPIVK 
KHYDNILNVSAIVLANLCVSYIMTSQNEKAEELMRKIEKEEEQL 
SYDDPNRKMYHLCI VNLVIGTLYCAKGNYEFGI SRVI KSLEPYK 
KKLGTDTWYYAKRCFLSLLENMSKHMIVIHDSVIQECVQFLGHC 
ELYGTNI PAVI EQPLEEERMHVGKNTVTDESRQLKAL I YE I IGW 
NK 


6539 


218 


339 


FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV


6540


3 


391 


LERLWLLLLRRPEDAMAECPTLGEAVTDHPDRLWAWEKFVYLDE
KQHAWLPLTIEIKDRLQLRVLLRREDWLGRPMTPTQIGPSLLP
IMWQLYPDGRYRSSDSSFWRLVYHIKIDGVEDMLLELLPDD


6541 


1165 


536 


RTLVQRRILMLLRKPARGRDLRGRGRGTPRGGRKGLLPTPDEFP 
RFEGGRKPDS WDGNRE PGPGHEHFRDTPR PDHP PHDGHS P AS RE 
RS S S LQGMDMAS L P PR KRP WHDGPGTS EHREMEAPGG P S EDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHALYLAFLARKEGT 
KRGFLSKKTAEASRWHEKWFALYQNVLFYFEGEQSCRPAGMYLL 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIEREVLMQKYIHLVQIVET 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDIKKIKKVQSFMRGWLCRRKWKTIVQDYICSPHAESMRKR 
NQ I VFTMVEAES E YVHQLYILVNGFLRPLRMAAS 9 KKPP I SHDD 
VSS I FLNSET I MFLHE I FHQGLKARIANWPTL I LADLFD I LLPM 
LNIYQEFVRNHQYSIjQVriANCKQNRDFDKLLKQYEANPACEGRM 
LET FLTY P M FQ I PRY 1 1 TLHELLAHTPHEHVERKS LE FAKS KLE 
ELS RVMHDEVS DTEN I RKNLAI ERM I VEGCD I LLDTSQT F I RQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGG KLHLLKTGGVLS L IDCTL IEEPDASDDDS KG SGQVFGHLDF 
KI WEPPDRAAFTWLLAPSRQEKAAWMSD I SQC VDN IRCNGLM 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YASVERLLERLTDLRFLS IDFLNTFLHTYRI FTTAAWLGKLSD 
I YKRPFTS I P VRSLELFFATSQNNRGEHLVDGKS PRLCRKFSSP 
PPLAVSRTS S PVRARKLSLTS PLNS KIGALDLTTS SS PTTTTQS 
PAAS P P PHTGQ I PLDLS RGLS S P EQS PGTVB ENVDNPRVD L CNK 
LKRS IQKAVLESAPADRAGVESS PAADTTELS PCRS PSTPRHLR 
YRQPGGQTADNAHCS VS PASAFAI ATAAAGHGS P PGFNNTERTC 
DKE F 1 1 RRTATNRVLNVLRHWVS KHAQ DFELNNE LKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHLKLEDIIQMTDC 
MKAECFESLSAMELAEQ ITLLDHVI FRS IPYEEFLGQGWMKLDK 
NERT P Y I MKTS QH FNDMSNLVASQ I MN YADVS S RANA I E KWVAV 
AD I CRCLHNYNGVLE ITSALNRSAI YRLKKTWAKVSKQTKALMD 
KLQ KTVS SEGRFKNLRETLKNCNP PAVP YLGMYLTDLAFI EEGT 
PNFTEEGLVNFS KMRM I S H 1 1 REI RQFQQTS YR I DHQPKVAQ YL 
LDKDLI IDEDTLYELSLKIEPRLPA 


6543 


1857 


950 


F VS GCGRAG I GLS WAMAAEAR VSR W Y FGGLAS CGAACCTH PLDL 
LKVHLQTQQE VKLRMTGMALRWRTDG I LALYSGLS ASLCRQMT 
YSLTR FAI YETVRDRVAKG S QGPL P FHEKVLLGS VS G LAGG FVG 
TPADLVNV^QNDVKLPQGQRRNYAHALDGLYRVAREEGLRRLF 
S GATMAS S RGALVT VGQLS CYDQAKQLVLSTGYLSDNI FTHFVA 
S FIAGGCATFLCQPLDVLKTRLMNS KGE YQGVFHCAVETAKLGP 
LAFYKGLVPAG I RLI PHTVLTFVFLEQLRKNFGI KVPS 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTDPEPRTLNPSPAGWFVQQHPELELMSSFRERFGRNWLQY 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQES PQKMSEEVRAEPQEEEEEKEGKEEKEEGEMAPLPEAHIX3 
EGKQKECP 


6545 


176 


560 


P PHSHAALLPAAMTPLLTL I LWLMGLPLAQALDCHVCAYNGDN 
CFNPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVPRCFETVYDGY 
S KHAS TTS COQ YDLCNGTGLATPATLALAP I LLATL WGLL 


6546


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKCNS 
S PG VL KVLAQLGLG F S CANKAEMELVQH I G I PASK 1 1 CANPCKQ 
IAQIKYAAKHGIQLLSFDNEMELAKWKSHPSAKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHLLENAKKHHVEVVGVSFHIGSGCPD 
PQAYAQS IADARLVFEMGTELGHKMHVLDLGGGFPGTEGAKVRF 
EEIASVINSALDLYFPEGCGVDIFAELGRYYVTSAFTVAVSIIA 
KKE VLLDQ PGREE ENGS TS KT i VYHLDEGVYG I FNS VL FDNI CP 
TP I LQ KKPS TEQ P L YS SS LWGPAVDGCDCVAEGL WLP QLHVGDW 
LVFDNMGAYTVGMGS P F WGTQACH IT YAMSRVAWEALRRQLMAA 
EQEDD VEGVCKPLS CGWEITDTIiCVGPVFTPAS IM 


6547 


1 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCRQPPHALRPLLLLPLVLL 
PPLAAAAAGPNRCDTIYQGFAECLIRLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEEAAAVWESLQQEARQAPRPNNLHTLCGA
PVHVRERGTGSETNQETLRATAPALPMAPAPPLLAAALALAYLL
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG 
IKGCGITFTLGKGTEVGELKILSRFQNA 


6549 


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA 
ARAFSKAPARIiSRPRAREEPPDPGRRYIQEEIIQARKHKLIKMC 
SSVAAKLWFLTDRRIREDYPQKEILRALKAKCCEEELDFRAVVM 
DEWLTIEQGNLGLRINGELITAYPQWWRVPTPWVQSDSDIT 
VLRHLE KMGCRLMNRP QA I LNCVNKFWT FQELAGHGVPL PDT FS 
YGGHENFAKMIDEAEVLEFPMVVKNTRGHRGKAVFLARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGGVGMMCSLSEQGKQLAIQVSNILGMDVCX3IDLL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGI IADYAASLLPSG 
RLTRRMSLLSWSTASETSEPELGPPASTAVDNMSASSSSVDSD 
PES TERELLT KLPGG L FNMNQLLANE I KLL VD 


6550 


2293 


922 


FRVSRDGAPDCGIEQMGIiAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVSLIQFLIILGLVLFMVYGNVHVSTESNLQATERRAEGLYSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDRINASFR 
QCQGDRV I YTNNQR YMAAI I LS E KQ CRDQ FKDMNKS CDALL FML 
NQKVKTLEVE IAKEKTI CTKDKESVLLNKRVAEEQLVECVKTRE 
LQHQERQLAKEQLQKVQALCLPLDKDKFEMDLRNL WRDS IIPRS 
LDNLG YNL YHPLGSELAS I RRACDHMP S LMS S KVEELARSLRAD 
I ERVARENSDLQRQKLEAQQGLRAS QEAKQKVEKEAQAREAKLQ 
AECSRQTQLALEEKAVLRKERDNLAKELEEKKREAEQLRMELAI 
RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRKI 
LESQRPPAGI PVAPSSG 


6551 


157 


748 


IQPPDPRNMTIAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVI S DMEVI ELNKCTSGQS FE VI LKPPS FDGVP E FN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWC V I S DME V I E LNKCTSGQS FEVI LKP PS FDGVP E FN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
E VIQKAI EENNNFI KMAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6553 


2 


1807 


FVWSKMAAHLSYGRVNLNVLREAVRRELREFLDKCAGSKAIVWD
E YLTG P FG LIAQ YS LLKEHE VE KMFTL KGNRL PAADVKNI I FFV 
RPRLELMDIIAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
I^VLGSFIHREBYSLDLIPFDGDljLSMESEGAFKECYLEGDQTS 
LYHAAKGLMTLQALYGTIPQI FGKGECARQVANMM IRMKREFTG 
SQNS I FPVFDNLLLLDRNVDLLTPLATQLTYEGLIDE I YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEELYAEIRDKN 
FNAVGS VLSKKAKI I SAAFEERHNAKTVGE I KQFVSQL PHMQAA 
RGSLANHTSI AELI WDVTTSEDFFDKLTVEQEFMSG IDTDKVNN 
Y I E DC I AQ KHSL I KVLRLVCLQSVCNS G L KQ KVLD Y YKRE I LQT 
YG YEH I LTLHNL E KAG LLKPQTGGRNNY PT IRKTLRLWMDDVNE 
QNPTDISYVYSGYAPLSVRLAQLLSRPGWRSIEEVLRILPGPHF 
E E RQP L PTGLQKKRQ PGENRVTLI FFLGGVT FAE IAALRFLS QL 
EDGGTE YVIATTKLMNGTS W I EALMEKP F 


6554 


119 


1244 


FEMGSQVSVESGALfCWrVGGGFGGIAAASQLQALNVPFMLVDM 
KDS FHHNVAALRASVETG FAKKTFI S YSVTFKDNFRQGL VVGID 
LKNQMVLLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDMVRQVQRSRF I VWGGGSAGVEMAAE I KTE YPEKEVTLIH 
S QVALADKE LL PS VRQE VKE I LLRKGVQLLLS E RVSNLE E LPLN 
EYREYIKVQTDKGTEVATNLVILCTGIKINSSAYRKAFESRLAS 
SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 
AN I VNS VKQR PLQAYKPGALT FL LS MG RNDGVGQ I SG FYVGRLM 
VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALLRKINQVLLFLLIVTLCVILYKKVHKGTVPKNDADDESE 
TPE ELEEE I PWICAAAGRMGATMAAINS I YSNTDANI LF YWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQPLNFVRFYLPLL IHQHEKVI YLDDDVIVQGD IQELYDTTLA 
LGHAAAFSDDCDLPSAQDINRLVGLQNTYMGYLDYRKKAIKDLG 
ISPSTCSFNPGVIVANMTEWKHQRITKQLEKWMQKNVEENLYSS 
SLGGGVATS PML I VFHGKYS TINPLWH I RHLG WNPDARYS EHFL 
QEAKLLHWNGRHKP WDFP S VHNDLWES WFVPDPAGI FKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVIILPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNSQGQLRVPWFVTNAGNILQHSKAQELSALLG 
CEVDADQVILSHSPMKLFSEYHEKRMLVSGQGPVMENAQGLGFR 
NVVTVDELRMAFPLLDMVDLERRLKTTPLPRNDFPRIEGVLLLG 
EPVRWETSLQL I MDVLLSNGS PGAGLATPPYPHLPVLASNMDLL 
WMAEAKMPRFGHGTFLLCLETIYQKVTGKELRYEGLMGKPSILT 
YQYAEDLIRRQAERRGWAAPIRKLYAVGDNPMSDVYGANLFHQY 
LQKATHDGAPELGAGGTRQQQPSASQSCISILVCTGVYNPRNPQ 
STEPVLGGGEPPFHGHRDLCFSPGLMEASHWNDVNEAVQLVFR 
KEGWALE 


6557 


2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN
KSPQSNSPVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 
S KLQ FNTTNCRS DTVMEKRS FKVPLGKGRRCWLADGFYE WQRC 
QGTNQRQ P YF IYFPQIKTEKSGS IGAADS PENWE KVWDNWRLLT 
MAG I FDCWEPPBGGDVLYS YT 1 1 TVDS CKGLSD I HHRMPAI LDG 
E EAVS KWLDPGE VS TQEALKL I HPTEN I T FHAVS S WNNSRNNT 
PECLAPVDLVVKKELRASGSSQRMLQWIiATKSPKKEDSKTPQKE 
ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP 
YSQ 


6558


21 


1138 


FHGRRRGGRKMELGSCLEGGREAAEEEGEPEVKKRRLLCVEFAS 
VASCDAAVAQCFLAENDWEMERALNSYFEPPVEESALERRPETI 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENGSMFSLITWNID 
GLDLNNLSERARGVCSYLALYSPDVIFLQEVIPPYYSYLKKRSS 
NYE I ITGHEEGYFTA I MUCKS RVKLKS QE 1 1 PFPS TKMMRNLLC 
VHVNVSGNELCLMTSHLESTRGHAAERMNQLKMVLKKMQEAPES 
A WI FAGDTNLRDRE VTRCX3GLPNNI VD VWE FLX3KP KHCQ YTWD 
TQMNSNLGITAACKLRFDRIFFRAAAEEGHIIPRSLDLLGLEKL 

DCGRFPSDHWGT.T.PMT.nTTT, 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPlAMDCCASRSCSVPTGPATTICSS 
DKS CRCGVCLPSTCPHTVWLLE PTCCDNCPPPCHI PQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCLPRGC 


6560 


3 


1435 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GVDHTKMSLHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
IRNVTS PTRQHHVEREKDHSSSRPSS PRPQKAS PNGS ISSAGNS 
S RNS SQS S S DGS CKTAGEMVFVYENAKEGARNI RTS ER VTL IVD 
NTRFWDP S I FTAQPNTMLGRMFGSGREHNFTRPNE KGEYEVAE 
GIGS TVFRAI LD YY KTG 1 1 RCPDG I S I PELREACDYLC I S FEYS 
TIKCRDLSALMHELSNDGARRQFEFYLEEMILPLMVASAQSGER 
ECHIWLTDDDWDWDEEYPPQMGEEYSQIIYSTKLYRFFKYIE 
NRDVAKSVLKERGLKKIRLGIEGYPTYKEKVKKRPGGRPEVIYN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKSKS ITNLAAAAADI PQD 
QLWMHPT PQVDELD I LP IHPPSGNSDLDPDAQNPML 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLSPEPSPSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGFGNLL 
AKQL VDRGMQ VLAACFTEEG SQ KLQRDTS YRLQTTLLD VTKSE S 
I KAAAQ WVRD KVG E QG LW AL VNN AGVGL PS G PNE WLT KDD FVKV 
INVNLVGLI EVTLHMLPMVKRARGRWNMS SSGGRVAVIGGGYC 
VS KFG VE AF S DS I RRE L YYFG VKVCI I E PGN YRTAI LGKENLE S 
RMRKLWERLPQETRDSYGEDYFRIYTDKLKNIMQVAEPRVRDVI 
NS M EHAI VSRS PR I R YNPGLDAKLLY I P LAKLPT P VTD FI LSRY 

T.OPDZVnCV 
JjrJX xrrvUo V 


6562 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
L YEWFLGKRSEGVP VS GPMLI EKAKDFYEQMQLTE PCVFSGGWL 
WRFKARHGIKKLDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGIQHLPVAYKAQGNAWVDKEIFS 
DWFHHIFVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAI FS VACAWNAVP SHVFRRAWRKLWPSVAFAEGSSSEEE 
LE AECF P VKPHNKS FAH I LELVKEGSS C PGQLRQRQAAS WGVAG 
REAEGGRPPAATSPAEWWSSBKTPKADQDGRGDPGEGEEVAWE 

y/V\ V rVT Un V JjK c nfiKy f \* c v LiK/iliKA V r Ko y y y VRRRR 

GALGAWKVEAIiQEGPGGCGATAQS PLP CSSTAGDN 


6563 


1319 


2694 


LARPAQPVXjLREPEGAGPPVPAGHLVHHLQGGHLRERAHPDLEA 
HEHPLP CDQMFWRQMGGHLRMVE ANSRG WWG I GYDHTAWVYTG 
G YGGGCFQGLAS STSNI YTQSDVKCVHI YENQRWNPVTGYTSRG 
LPTDRYMWSDASGLQECTKAGTKPPSLQWAWSDWPVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 
EVP P IALRDVS 1 1 PES PGAEGSGHS I ALWAVS DKGD VLCRLG VS 
a uiv f/\vjo o w i»ri v \j x Jjy Jr r J\a ± a i (aA^ iy vwAVAKUvjijArx RGSV 
YPS Q PAGD CW YH I PS P PRQRLKQVS AGQTS VYALDENGNLW YRQ 
GI TPS YPQGS S WEHVSNNVCRVS VGPLDQVWVr ANKVQGSHSLS 
RGTVCHRTGVQPHEPKGHGWDYGIGGGWDHISVRANATRAPRSS 
SQEQEPSAPPEAHGPVCC 


6564 


1 


975 


APGSCALWSYCGRGWSRAMRGCQIJJGLRSSWPGDLLSARLLSQE 
KRAAETHFGFETVSEEEKGGKVYQVFESVAKKYDVMNDMMSLGI 
HRVWKDLLLW KMHPLPGTQLLDVAGGTGDI AFRFLNYVQS QHQR 
KQKRQLRAO^NLSWEEIAKEYQNEEDSLGGSRVWCDINKEMIiK 
VGKQKALAOX5YRAGLAWVLGDAEELPFDDDKFDI YTIAFG IRNV 
THIDQALQEAHRVLKPGGRFLCLEFSQVNNPLISRLYDLYSFQV 
IPVLGEVIAGDWKSYQYLVESIRRFPSQEEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


999 


RSAVANGLTKRRMGLKLNGRYISL I LAVQ IAYLVQAVRAAGKCD 
AVFKGFSDCLLKLGDSMANYPO^LDDKTNIKTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAAGS 
LLPAFPVLLVSLSAALATWLSF


6566 


3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 
HVPPGCSQG^PLYYNLCDRSGAWGIVLEAVAGAGIVTTFVLTI 
ILVASLPFVQDTKKRSLLGTQVFFLLGTLGLFCLVFACVEKPDF 
STCASRRFLFGVLFAICFSCLAAHVFALNFLARKNHGPRGWVIF 
TVALLLTLVEVI INTEWLI ITLVRGSGEGGPQGNSSAGWAVASP 
CAIANMDFVMAL I YVMLLLIjGAFLGAWPALCGRYKR WRKHG VFV 
LLTTATSVAIWVWIVMYTYGNKQHNSPTWDDPTLAIALAANAW 
AFVLFYVIPEVSQVTKSSPEQSYQGDMYPTRGVGYETILKEQKG 
QSMFVENKAFSMDEPVAAKRPVSPYSGYNGQLLTSVYQPTEMAL 
MHKVPSEGAYD 1 1 LPRATANSQVMGS ANSTLRAEDMYSAQSHQA 
ATPPKDGKNSQVFRNPYVWD


6567


125 


863 


TKRSNLKAYACS I HH I RTMS YVF VNDS SQTNVP LLQAC I DGDFN 
YSKRLLESGFDPNIRDSRGRTGLHLAAARGNVDICQLLHKFGAD 

AKRRGVNKDVIRLLESLEEQEVKGFNRGTHSKLETMQTAESESA 
MESHS LLNPNLQQGEGVLSS FRTTWQEFVEDLGFWRVLLLI FVI 
ALLS LG I AY Y VS G VL P F VENQ PE LVH 


6568 


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTELFHPTLASISSPM 
LEGAELYFNVDHGYLEGLVRGCKASLLTQQDYINLVQCETLEDL 
KIHLQTTDYGNFLANHTNPLTVSKIDTEMRKRLCGEFEYFRNHS 
LEPLSTFLTYMTCSYMIDNVILLMNGALQKKSVKEILGKCHPLG 
RFTEMEAVNIAETPSDLFNAILIETPLAPFFQDCMSENALDELN 
I E LLRNKL YKS YLEAFY KFC KNHGDVTAE VMC P I LEFEADRRAF 
1 1 TLNS FGTELS KEDRBTL Y PTFGKL Y P EGLRLLAQAED FDQMK 
wv/tuniuv xi\JrijriiHVUVjoo«I\IJjr.lJVr lEREVQMNVJLiAFNRQF 
H YG VF YA YVKL KEQE I RN I VW I AE CIS QRHRT K INS Y I P I L 


6569 


205 


1532 


RRRGPQRLGHGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR 
RMSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSSI 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMD KQG VYVTS PL VNNFTMHS D LGK 1 1 QSLLDE FWKN P P VLA 
PTSTAPPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVSS STTSHTTAKPAAPSFGVLSNLPLPI PTVDAS I PTSQNG 
c\jXjsmyuveUAc FtibbfijJjoVbUiJADMNEQEEVLLEQFLTLPQLK 
QI ITDKDDLVKS IEELARKNLLLEPSLEAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSESCSASALQARLKVAAHEAEEESDNIAED 
FLEGKMEIDDFLSSFMEKRTICHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570 


330 


1304 


ARL PRLT FLREG FL YVLLSHWVFVGAP R P PAS DS WKKGL VP S AP 
PAS RKMGS KAL PAP I PLH P S LQLTNYS FLQAVNT FPAT VDHLQG 
LYGLSAVQTMHMNHWTLGYPNVHEITRSTITEMAAAQGLVDARF 
P FPAL P FTTHLFHP KQGA I AHVL PALH KDRPR FDFANIAVAATQ 
EDPPKMGDLSKLS PGLGSP I SGLSKLTPDRKPSRGRLPSKTKKE 
FICKFCGRHFTKSYNLLIHERTHTDERPYTCDICHKAFRRQDHL 
RDHRY I HS KEKPFKCQECX3KGFCQSRTLAVHKTLHMQTSS PTAA 
SSAAKCSGETVICGGT 


6571 


169 


656 


APDMNRKKLQKLTDTLTKNCKHLFRGFDKDNDGCVNVLEWIHGL
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP
SEEDPDEGIKDLVEITLKKMDHDHDGKLSFADYELAVREETLLL
EAFGPCLPDPKSQMEFEAQVFKDPNEFNDM


6572 


49 


1646 


LS CSERHQKLVDENYCKKLHVQALKNVNSQ I RNQMVQNENDNRV 
QRKQFLRLLQNEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELAKLKHESLKDEKMRQQVRENSIELRELEKKLKAAYMNKERAA 
Q IAEKDAI KYEQMKRDAEIAKTMMEEHKRI IKEENAAEDKRNKA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED 
QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKII 
EFANMQQQREEDRMAKVQENEEKRLQLQNALTQKLEEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA
LKELVLQAAKEEEENFRKTMLAKFAEDDRIELMNAQKQRMKQLE
HRRAVEKLIEERRQQFLADKQRELEEWQLQQRRQGFINAIIEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGGGESQSFRAQDGTRTPATDCLMYLQGPRKLMTQGGYDMVQK 
LFLDFFRRRLSQRPTAEELEQRNILKPRNEQEEQEEKREIKRRL 
TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LTAADKVSRGECWRVGGRTVCWVSLGSPLGSV 


6574 


204 


1159 


LESSVPVSVGVFWACGVSWTGAAGLQDGALSDTMARNAEKAMTA
LARFRQAQLEEGKVKERRPFLASECTELPKAEKWRRQIIGEISK
KVAQIQNAGLGEFRIRDLNDEINKLLREKGHWEVRIKELGGPDY
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP
PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 
KWKAEREARLARGEKEEEEEEEEEINIYAVTEEESDEEGSQEKG 
GDDSQQKFIAHVPVPSQQEIEEALVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 


6575 


117 


820 


SPALASQSGGITEEKMLEPQENGVIDLPDYEHVEDETFPPFPPP 
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI 
SERGLPALRHVFDKAKFKGKGHEAEDLKMLIRHMEHWAHRLFPK
LQFEDFIDRVEYLGSKKEVQTCLKRIRLDLPILHEDFVSNNDEV
AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEQQQRIER 
NKQLALERRQAKLP


6576


1 


1060 


PEPQALVGQKRGALRLLVARLVLTVSAPAEVRRRVLRPVLSWMD
RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR 
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYGDV 
WPVANCGVQEYNSNPKEHMTLRDYITYWKEYIQAGYSSPRGCL 
YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDALDVDDYRFVY
AGPAGSWSPFHADIFRSFSWSVNVCGRKKWLLFPPGQEEALRDR
HGNLPYDVTSPALCDTHLHPRNQLAGPPLEITQEAGEMVFVPSG
WHHQVHNLVMCCFSCPLSGAFLQEDGSTTSPLSQPELGWNGVAH 
G 


6577 


2271 


987 


SDRMASDDFDIVIEAMLEAPYKKEEDEQQRKEVKKDYPSNTTSS 
TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF
REKSPVREPVDNLSPEERDARTVFCMQLAARIRPRDLEDFFSAV 
GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG
VPIIVQASQAEKNRIAAMANNLQKGNGGPMRLYVGSLHFNITED
MLRGIFEPFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL
EQLNGFELAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPLGA
LNPAALTALSPALNLASQCLQLSSLFTPQTM 


6578


377 


1489 


PSSSATMNRAPLKRATILHMALTGASDPSAEAEANGEKPFLLRA 
LQIALWSLYWVTSISMVFLNKYLLDSPSLRLDTPIFVTFYQCL
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFIG 
MITFNNLCLKYVGVAFYNVGRSLTTVFNVLLSYLLLKQTTSFYA
LLTCGIIIGGFWLGVDQEGAEGTLSWLGTVFGVLASLCVSLNAI
YTTKVLPAVDGSIWRLTFYNNVNACILFLPLLLLLGELQALRDF
AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTSPLTHNVSGTA
KACAQTVLAVLYYEETKSFLWWTSNMMVLGGSSAYfWVRGWEMK 
KTPEEPSPKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT
EDIPKEVLMDCAHLVKANSIQGCKMNNVNWYTPWSNLKKTADM
DVGQIGFHRQKDVKIVTVEKKVNEILNRLEKTKVERFPDLAAEK
ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRSYSSLMKV
ENMSSNQDGNDSDEFM 


6580 


62 


1571 


LVALKNWKPKGTNIPAPQSPVFGEAVSGVYMMTKVLGMAPVLGP
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG
PREALSQLRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQELQ
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE
KISSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA
QDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDIISVIIANKPE
ASLERQCVNLENEKGTKPPLQEAGSKKGRESVPTKPTPGERRYI
s.Aucu[\nj; oiiJoitLii xvnrvrv i. n 1 IjCj IS.it I VL1 JVL.oZvvr uiIjoIN Li 1 l_i 

HYRTHLVDRPYDCKCGKAFGQSSDLLKHQRMHTEEAPYQCKDCG 
KAFSGKGSLIRHYRIHTGEKPYQCNECGKSFSQHAGLSSHQRLH
TGEKPYKCKECGKAFNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
SKSNLSKHQRVHTGEGHAP


6581 


228 


476 


RVFLKDLSSTPMASNNTASIAQARKLVEQLKMEANIDRIKVSKA
AADLMAYCEAHAKEDPLLTPVPASENPFREKKFFCAIL 


6582 


1428 


718 


CFTTKTHCSPVSVPYLSPLVLRKELESLLENEGDQVIHTSSFIN 
QHPIIFWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLS 
QDSKLVYIQLLWDNINLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVETIRQSIQHNNVLKPINLLSQQMKPGMKRQRSLYRE 
ILFLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI 


6583 


487 


41 


RIFSMTSGRLRWRCTWRPATALWSASLRLGTSSMHPSPRSISLP
LSMMLSPLPSNTRGLSPTALFRSPDSEHATSCPRLHLWRCRAPL
RSPSPLGRLQVLPRSPLHVHTHNSGKEVLGLQVQRSRSGTGPAC


6584


189 


1750 


PLPMAALGPSSQim'EYVVRVPKNTTKKYNIMAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG
IVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTENTSYYIFTQ
CPIGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNH
FSIMQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAE 
GGSTSSTLRAAASKLEQGKRVSEMPAAKRLRLDTGPQSLSGKST
PQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK
TGLSSEQTVNVLAQILKRLNPERKMINDKMHFSLKE




-a 
.> 


1678 


GPIRNSRIDDFVGGDPRAEASCSVLHSKPHAMADSRDPASDQMQ
HWKEQRAAQKADVLTTGAGNPVGDKLNVITVGPRGPLLVQDVVF 
TDEMAHFDRERIPERWHAKGAGAFGYFEVTHDITKYSKAKVFE
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL
VGNNTPIFFIRDPILFPSFIHSQKRNPQTHLKDPDMVWDFWSLR 
PESLHQVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANGEAVYCK 
r n I i\ 1 jjy <jl Jt\£i Juo v iiUAAKJjo QiLu P D YG IRDLFNAIATGKYPSW 
TFYIQVMTFNQAETFPFNPFDLTKVWPHKDYPLIPVGKLVLNRN
PVNYFAEVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDTHRHR
LGPNYLHIPVNCPYRARVANYQRDGPMCMQDNQGGAPNYYPNSF
GAPEQQPSALEHSIQYSGEVRRFNTANDDNVTQVRAFYVNVLNE
EQRKRLCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSHIQALL
DKYNAEKPKNAIHTFVQSGSHLAAREKANL


6586 


32 


804 


PLPEQPAESTSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL 
NLGKKISVPRDVMLEELSLLTNRGSKMFKLRQMRVEKFIYENHP
DVFSDSSMDHFQKFLPTVGGQLGTAGQGFSYSKSNGRGGSQAGG
SGSAGQYGSDQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAGVGE 
TGSGDQAGGEGKHITVFKTYISPWERAMGVDPQQKMELGIDLLA 
YGAKAELPKYKSFNRTAMPYGGYEKASKRMTFQMPKV


6587 


75 


1117 


RRVPSLGKMPECWDGEHDIETPYGLI*HWIRGSPKGNRPAILTY
HDVGLNHKLCFNTFFNFEDMQEITKHFWCHVDAPGQQVGASQF
PQGYQFPSMEQLAAMLPSVVQHFGFKYVIGIGVGAGAYVLAKFA
LIFPDLVEGLVLVNIDPNGKGWIDWAATKLSGLTSTLPDTVLSH 
LFSQEELVNNTELVQSYRQQIGNWNQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRCPVMLVVGDNAPAEDGVVECNSKLDPTTTT 
FLKMADSGGLPQVTQPGKLTEAFKYFLQGMGYMPSASMTRLARS 
RTASLTSASSVDGSRPQACmiSESSEGLGQVNHTMEVSC 


6588 


137 


501 


LGLQAQLLELRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AOKALSKSKKAOEVEVIjLSENEMIiOAK'TiHQnPPnwPT.nMcjTT m& 
EFSKLCSQMEQLEQENQQLKEGAAGAGVAQAGP


6589 


2 


1405 


RPWGSAMATFSRQEFFQQLLQGCLLPTAQQGLDQIWLLLAICLA 
CRLLWRLGLPSYLKHASTVAGGFFSLYHFFQLHMVWVVLLSLLC 
YLVLFLCRHSSHRGVFLSVTILIYLLMGEMHMVDTVTWHKMRGA
QMIVAMKAVSLGFDLDRGEVGTVPSPVEFMGYLYFVGTIVFGPW 
ISFHSYLQAVQGRPLSCRWLQKVARSLALALLCLVLSTCVGPYL
FPYFIPLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFV
GFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW
TSWNLPMSYWLNNYVFKKALRLGTFSAVLVTYAASALLHGFSFH 
LAAVLLSLAFITYVEHVLRKRLARILSACVLSKRCPPDCSHQHR
LGLGVRALNLLFGALAIFHLAYLGSLFDVDVDDTTEEQGYGMAY


6590 


2177 


656 


VRAYEHVLSLLENVFTPMFCHRDEYFRQLLRGAESPTRNSKLNR
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY
VDFFEDPSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE
FQEYLQKLLQHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY
DYLMWGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
» j. yii-ur cci uw xv v Uivii v i o v i a vivi 


6591 


2177 


656 


VRAYEHVLSLLEIWFTPMFCHRDEYFRQLLRGAESPTRNSKLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE
FQEYLQKLLQHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEQRLKAIVAEKFAIATKEG 
DLPQVERFFKIFPLLGLHEEGLRKFSEYLCKQVASKAEENLLMV 

T/t}'!' 1 ! jMCnPRZk JMrTT?2\TTT , T. r T'T.T . CUf T & D TXrcTTJ^n TT rcT w/~i t>/~»ti 
Liu l ui'i ouRiuiH v xrwj ± ±j ± LtLiv aij ±J\s<.± Vciixiyi'i ViL 1 Y jfGPGR 

LYTLIKYLQVECDRQVEKVVDKFIKQRDYHQQFRHVQNNLMRNS
TTEKIEPRELDPILTEVTLMNARSELYLRFLKKRISSDFEVGDS
MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVTMEEYFMRE 
TVNKAVALDTYEKGQLTSSMVDDVFYIVKKCIGRALSSSSIDCL 
CAMINLATTELESDFRDVLCNKLRMGFPATTFQDIQRGVTSAVN 
IMHSSLQQGKFDTKGIESTDEAKMSFLVTLNNVEVCSENISTLK 
KTLESDCTKLFSQGIGGEQAQAKFDSCLSDLAAVSNKFRDLLQE
GLTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 
QFILNLEQQMAEFKASLSPVIYDSLTGI^TSLVAVELEKVVLKS 
TFNRLGGLQFDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN
LERVTEILDYWGPNSGPLTWRLTPAEVRQVLALRIDFRSEDIKR
LRL 


6593 


3 


1837 


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR
RGGEGGGGRGRGDKRRRRQARRQRRRPE PAE7\RGGKMADVLS VL 
RQYNIQKKEIWKGDEVIFGEFSWPKNVKTNYVVWGTGKEGQPR
EYYTLDSILFLLNNVHLSHPVYVRRAATENIPWRRPDRKDLLG
YliNGEASTSASIDRSAPLEIGLQRSTQVKRAADBVLAEAKKPRI 
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
IKAKIMAKKRSTIKTDLDDDITALKQRSFVDAEVDVTRDIVSRE
RVWRTRTTILQSTGKNFSKNIFAILQSVKAREEGRAPEQRPAPN 
AAPVDPTLRTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKSVTEGASARKTQTPAAQPVPRPVSQARPPPNQKKGSRTP 
1 1 1 IPAATTSLITMLNAIQ!)IjLnDT,KT^PQriPV , K , ifnpr , r>T?T< , TaT?TT 

IQRRKDQMQPGGTAISVTVPYRWDQPLKLMPQDWDRWAVFVQ 
GPAWQFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF


6594 


1 


1096 


EFPGRRFRGSQASPLCATOGPALLRAPTRAAMTRSLFKGNFWSA
DILSTIGYDNIIQHLNNGRKNCKEFEDFLKERAAIEERYGKDLL

ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFBCKTMDAKKNYE 
QKCRDKDEAEQAVSRSANLVNPKQQEKLFVKLATSKTAVEDSDK 
AYMLHIGTLDKVREBWQSEHIKACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCSIQRDIEYFVNQRKTGQI 
PPAPIMYENFYSSQKNAVPAGKATGPNLARRGPLPIPKSSPDDP 
NYSLVDDYSLLYQ


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWLYLH 

P YNA VPCJT^nPk'T.CT.CnOTXTr C\7TJ*\T <" , MMI?T"M7i DDDT t nrvMr nvr\ 
roayriMioitouy lJNi-io VJjV ALXNWr XWAKKKJjjjPDMLjRKD 

GKDPNQFTISRRGGKASDVALPRGSSPSVLAVSVPAPTNVLSLS
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA
GSPTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMELQK
QQDPSLPLLHTPIPLVSENPQ


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD

FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPVVCPKQV
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA
AGGRLLHLMEILNVKNVMVWSRWYGGILLGPDRFKHINNCARN
I LVEKNYTNS PEESSKALGKNFOCVRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY
GEEWCVTDDCAKIFCTR T QnDTnrjPTTWTT.r'T.nvMT owcvuPTan 

PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA
AGGRLLHLMEILNVKNVMWVSRWYGGILLGPDRFKHINNCARN
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6598 


1099 


419 


PR\mWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQESPLFNNVKLQRKLPVESIQIVLEE 
LRKKGNLEWLDKSKSSFLIMWRRPEEWGKLIYQWVSRSGQNNSV 
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAEIITVS
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQCPPKTGSVTPPD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKYIDENQDRYIKKLAKWVAIQSVSAWPEKRGEIRR
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILLGRLGS 
DPQKKTVCIYGHLDVQPAALEDGWDSEPFTLVERDGKLHGRGST
DDKGPVAGWINALEAYQFCTGQEIPVNVRFCLEGMEESGSEGIiDE 
LIFARKDTFFKDVDYVCISDNYWLGKKKPCITYGLRGICYFFIE
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGNILIPGIN
EAVAAVTEEEHKLYDDIDFDIEEFAKDVGAQILLHSHKKDILMH
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE
WGEQVTSYLTKKFAELRSPNEFKVYMGHGGKPWVSDFSHPHYL 
AGRRAMKTVFGVEPDLTREGGSIPVTLTFQEATGKNVMLLPVGS
ADDGAHSQNEKLNRYWYIEGTKMLAAYLYEVSQLKD 


6600 


1 2 


934 


PGRkFRVAAMESAGLEQLLRELLLPDTERIRRATEQLQIVLRAP 

AALSALCDLLASAADPQIRQFAAVLTRRRLNTRWRRLAAEQRES 

IjI^IjIIjIAIjQKETEHCVSLSIiAQLSATIFRKEGLEAW 

QHSTHSPHSPEREMGLLLLSVWTSRPEAFQPHHRELLRLLNET

I^EVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMLVPKLIMAMQ 

TLIPIDEAKACEALEALDELLESEVPVITPYLSEVLTFCLEVAR

JN VAiAjW A 1 K I K I IA.L.I1 1 r L> v J\ V KS 1UUjIjKNRIjIiATLJUUiPFPHC 

GC 


6601


529 


1420 


PRAAARAPPPAVLRRDRRAATAPGAGEMTLHGPLAQRYFLNHIE
KITTWQDPRKAMNQPLNHMNLHPAVSSTPVPQRSMAVSQPNLVM
NHQHQQQMAPSTLSQQNHPTQNPPAGLMSMPNALTTQQQQQQKL 
KXjU K x sffntS KiSK 1 KMKy h b LiMKy EAALiCRQL r MJSAE TLAP VQAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV
PTTPEDFLSNVDEMDTGENAGQTPMNINPQQTRFPDFLDCLPGT 
NVDLGTLESEDLIPLFNDVESALNKSEPFLTWL


6602 


127 


617 


LLDFPALPKFVLAQSPKAGKPSTMTSMTQSLREVIKAMTKARNF 
ERVLGKITLVSAAPGKVICEIVIKVEEEHTNAIGTLHGGLTATIjVD 
JNJ.O X JYI/vuJjV. J. bXUAFU vo VJJMN J.T I MbFAlujUhDIVITAHVTjKQ 

GKTLAFTS VDLTNKATGKL I aqgrhtkhlgn 


6603 


79 


660 


PVGPSSLAARTGLGHLPFLHRI^SRGIjDMDLLQFLAFLFVLLL 
SGMGATGTLRTSLDPSLEIYKKMFEVKRREQLLALKNLAQLNDI 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFSHWENTAFFGDWLRFPRIVHYYFDHNSNWNLLIRWGISFC
NQTGVFNQGPHSPILSLM


6604 


•a 


688 


TSTAQRQGGERMSFRGGGRGGFNRGGGGGGFNRGGSSNHFRGGG
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERVVLLGEFL 
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEIFGQLR 
DFYFSVKLSENMKASSFKKLQKFYIDPYKLLiPLQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


1 


848 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDRDLDDVLEKAKKANW 
ALVAVAEHSGEFEKIMQLSERYNGFVLPCLGVHPVQGLPPEDQR
SVTLKDLDVALPIIENYKDRLLAIGEVGLDFSPRFAGTGEQKEE
QRQVLIRQIQLAKRLNLPVNVHSRSAGRPTINLLQEQGAEKVTiL 
HAFDGRPSVAMEGVRAGYFFSIPPSIIRSGQQKLVKQLPLTSIC
LETDSPALGPEKQVRNEPWNISISAEYIAQVKGISVEEVIEVTT 
QNALKLFPKLRHLLQK 


6606 


2 




r VE I R PRAE VANLSAHSAS P I QDAVLKRLS LLEDIVYRQLNGLS 
KSLGLIEGYGGRGKGGLPATLSPAEEEKAKGPHEKYGYNSYLSE
KISLDRSIPDYRPTKCKELKYSKDLPQISIIFIFVNEALSVILR 
S VHS AVNHTPTHLIjKE 1 1 LVDDMS DE E E LKVPLE E YVHKT? YPfiT . 
VKWRNQKREGLIRARIEGWKVATGQVTGFFDAHVEFTAGWAEP 
VLSRIQENRKRVILPSIDNIKQDNFEVQRYENSAHGYSWELWCM 
YISPPKDWWDAGDPSIiPIRTPAMIGCSFWNRKFFGEIGLLDPG 
MDVYGGENIEIiGIKVWLCGGSMEVLPCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAEVWMDDYKSHVYIAWNLPLENPGIDIGDVSER 
RALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQGPIiENHTAILYPCHGWGPQIJ^YTKEGFLHLGALGTTTIjLP 
DTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQNGAIMNKGTG 
RCLEVENRGLAGIDLILRSCTGQRWTIKNSIK 


6607 


137 


986


VPACAGLKKEARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
GISFQGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 
SGWNEDTFKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKFE
DFVTALSILLRGTVHEKLRWTFNLYDINKDGYINQEEMMDIVKA
IYDMMGKYTYPVLKEDTPRQHVDVFFQKMDKNKDGIVTLDEFLE
SCQEDDNIMRSLQLFQNVM


6608 


224 


1140 


RPCFSSPTGLCPRLSYPMILLQHAVLPPPKQPSPSPPMSVATRS
TGTLQLPPQKPFGQEASLPLAGEEELSKGGEQDCALEELCKPLY
CKLCNVTLNSAQQAQAHYQGKNHGKKLRNYYAANSCPPPARMSN

WEPAATPWPVPPnWR9 FKPRfiR VT T.ATP r NnYr , "Prr.T*nzi CTTCCD 
v vfirnni rvvr v tr rynoa r ivruwn V x Lit-i.1 CjINLJ I ^AlfcllnO C oor 

AVAQAHYQGKNHAKRLRLAEAQSNSFSESSELGQRRARKEGNEF 
KMMPNRRNMYTVQNNSGPYFNPRSRQRIPRDLAMCVTPSGQFYC 
SMCNVGAGEEMEFRQHLESKQHKSKVSEQRYRNEMENLGYV


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLSPAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS 

PATNOAAnOFF VfiTf afTNTW V'AF'R'TSPli'TnTnT.TII'DWfWH'Ti M Ti Tn 

GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLHIFIRFPLTYPDMYMGMMCTAKKCGIRFQPPAIILI
YESEIKGKIRQRIMPVRNFSKFSDCTRAAEQLKNNPRHKSYLEQ
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKL
DDKELAKRKSIMDELFEKNQKKKDDPNFVYDIEVEFPQDDQLQS


6611 


978 


212 


PGCSGAGSRVWWLPALRHLAMGSTESSEGRRVSFGVDEEERVRV 
LQGVRLSENWNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST 
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHEEQKSVRLARELESREAELRRRDTFYK 

FOT.FP T FDTfWAl?MVI^T C C T?f"\CUT7 7i B C V WC O r r , T'T/"OT"5TTi rn'Tii r/~" o/~»r 

QAQILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAKRIQKELAEITLDPPPNCSAGPKGDNIYEWRS
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVICLDILKDNWSPALTISKVLLSICSLLTDCNPADPLVGSI
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI

PTVPDT YR nVT COOV CTPTT.nTTTVTTl^CUnn'DIlMADT OTQmUT\ 
rx v EiL/ x i i\y v XoiwL/i\.£> XL, J. L»y 1 1 Ly 1 Hjori^Jtf r/iTiyKxJoXoiUjnA 

FILVYSITSRQSLEELKPIYEQICEIKGDVESIPIMLVGNKCDE
SPSREVQSSEAEALARTWKCAFMETSAKLNHNVKELFQELLNLE 
KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM


6614 


3 


1191 


SSAAEAMRVLVRRCWGPPLAHGARRGRPSPQWRALARLGWEDCR
DSRVREKPPWRVLFFGTDQFAREALRALHAARENKEEELIDKLE 
WTMPSPSPKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASF

GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 
MLISVLKNLPESLSNGRQQPMEGATYAPKISAGTSCIKWEEQTS 
EQIFRLYRAIGNIIPLQTLWMANTIKLLDLVEVNSSVLADPKLT
GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NGYLHPWYQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSLRLQTDARKVRCILTGHE 
LPCRLPELQVYTRGKKYQRLVRASPAFDYAEFEPHIVPSTKNPH
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRREDQMDGDGPRPREAFWEPTSSDEGGAASDDSM
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 
EGRRETTVYRGLVQKRGKKQLGSLKKKFKSHHRKPKSFSSCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPSICRS
PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPIWLQPSPPPQSSP 
PPQPHPCHTCRGLVDSFNKGLERTIRDNFGGGNTAWEEENLSKY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEELVESWWFHKQQEA
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHLVCSACF
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEGANCGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC
LDVDECETEVCPGENKQCENTEGGYRCICAEGYKQMEGICVKEQ 
IPESAGFFSEMTEDELWLQQMFFGIIICALATLAAKGDLVFTA
IFIGAVAAMTGYWLSERSDRVLEGFIKGR 


6617 


118 


673 


WMAWQVSLLELBDRLQCPICLEVFKESLMLQCGHSYCKGCLVS 
LSYHLDTKVRCPMCWQAVDGSSSLPNVSLAWVIEALRLPGDPEP
KVCVHHRNPLSLFCEKDQELICGLCGLLGSHQHHPVTPISTVCS
RMKEELAALFSELKQEQKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 


6618 


548


136 


DGKVARRAPNSPAFQNDIYPLVSAPRATTAESPWSKVLQNTQCR
NVPKMTSERSRIPCLSAAAAEGTGKKQQEGRAMATLDRKVPSPE
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PASSEVLTAAVMFLLLNCIVAVSQNMGIGKNGDLPRPPLRNEFR
YFQRMTTTSSVEGKQNLVIMGRKTWFSIPEKNRPLKDRINLVLS
RELKEPPQGAHFLARSLDDALKLTERPELANKVDMIWIVGGSSV
YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLLPEYPG
ILSDVQEGKHIKYKFEVCEKDD


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 
MGSQDGSPLRETRKDPFSAAAAECSCRQDGLTVIVTACLTFATG
VTVALVMQIYFGDPQIFQQGAWTDAARCTSLGIEVLSKQGSSV
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES
APGALREETLQRSWETKPGLLVGVPGMVKGLHEAHQLYGRLPWS 
QVLAFAAAVAQDGFNVTHDLARALAEQLPPNMSERFRETFLPSG
RPPLPGSLLHRPDLAEVLDVLGTSGPAAFYAGGNLTLEMVAEAQ 
HAGGVITEEDFSNYSALVEKPVCGVYRGHLVLSPPPPHTGPALI 
SALNILEGFNLTSLVSREQALHWVAETLKIALALASRLGDPVYD 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPLLPVYELDGAPT 
AAQVLIMGPDDFIVAMVSSLNQPFGSGLITPSGILLNSQMLDFS 
WPNRTANHSAPSLENSVQPGKRPLSFLLPTWRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV 
SIPHAANMG 


6621 


1 


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKENFEFYELAKLLPLP
AAITSQLDKASIIRLTISYLKMRDFANQGDPPWNLRMEGPPPNT
SVKVIGAQRRRSPSALAIEVFEAHLGSHILQSLDGYVFALNQEG
KFLYISETVSIYLGLSQVELTGSSVFDYVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 


2 


319 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN
AEVGKSCIIKRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN
IFDMAGHPFFYEVRKPF


6623 


1886 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQTVLKVIKFLIIIAYNSA
LVSKVQFTVDCNVDIQDMTGYKNFSCNHTMAHLFSKLSFCYLCF
VSIYGLTCLYTLYWLFYRSLREYSFEYVRQETGFDDIPDVKNDF
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNNEWTPDKL 
RQKLQTNAHNRLELPLIMLSGLPDTVFEITELQSLKLEIIKNVM
IPATIAQLDNLQELSLHQCSVKIESAALSFLKENLKVLSVKFDD
MRELPPWMYGLRNLEELYLVGSLSHDISRNVTLESLRDLKSLKI
LS I KSNVSKIPQAV\TOVSSHLQKMCIHNDGTKLVMLNNLKKMTN 
LTELELVHCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ
HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFSHNKIEVLPSH
LFLCNKIRYLDLSYNDIRFIPPEIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI 
LPPELGDCRALKRAGLWEDALFETLPSDVREQMKTE 


6624 


218 


1786 


GSRRGGGSRIPAVSTHVAPGRSVLRPFASGALRLRSLVKAIX5GC 
RGRPSGLAHLSQETSHWRAKRSGRACLGDFPGEILRSFIMKCTA
REWIiRVTTVLFMARAIPAMVVPNATIiLEKLLEKYMDEDGEWWIA 
KQRGKRAITDNDMQSILDIiHNKLRSQVYPTASNMEYMTWDVELE 
RSAESWAESCLWEHGPASLLPSIGQNLGAHWGRYRPPTFHVQSW 
YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AINLCHNMNIWGQIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT
HVRTRSDDSSRNEVISAQQMSQIVSCEVRLRDQCKGTTCNRYEC
PAGCLDSKAKVIGSVHYEMQSSICRAAIHYGIIDNDGGWVDITR 
QGRKHYFIKSNRNGIQTIGKYQSANSFTVSKVTVQAVTCETTVE
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF


6625 


1124 


543 


PGPRGGGGSLLSTKALGRSRGLGMHPGPSSGGTEGGVPTALRPP
GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW
SFYGQGRKIAEVCCTSIVYATEKKQTKVEFPEARIFEETLNILI
YETPRGPDPALLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR
HITHDERPHGQQIVFKD


6626 


3 


1498 


SAVEFVYTDRFHJjILGISVEFLCSLRSDATMESITACLHALQAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQIICAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQILLED 
GSRLVSAALVILSELPAVCSPEGSISILPTILYLTIGVLRETAV
KLPGGQLSSTVAASLQALKGILSSPMARAEKSRTAWTDLLRSAL 
TTILDCWDPVDETHQELDEVSLLTAITVFILSTSPEVTTIPCLQ
KRCIDKFKATLEIKDPWQIKTYQLLHSIFQYPNPAVSYPYIYS
LASCIMEKLQEIDKRKPENTAELEIFQEGIKVLETLVTVAEEHH 
RAQLVACLLPILISFLLDENSLGSATSIMRNLHDFALQNLMQIG
PQYSSVFKSLVASSPALKARLEAAIKGNQESVKVKIPTSKYTKS
PGKNSSIQLKTSFL 




1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL 
GDTGVGKTCFLIQFKDGAFLSGTFIATVGIDFRNKWTVDGVRV 
KLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDNIRA
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFLETSAKTGMNVELAFIiAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN
KEFGDSLSLEILQIIKESQQQHGLRHGDFQRYRGYCSRRQRRLR
KTLNFKMdNRHKFTGKKVTE ELLTDNR YLLLVLMDAERAWS YAM 
QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 
LEAQAYTAYLSGMLRFEHQEWKAAIEAFNKCKTIYEKLASAFTE
EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQMRLRSGGTE
GLLAEKLEALITQTRAKQAATMSEVEWRGRTVPVKIDKVRIFLL 
GLADNEAAIVQAESEETKERLFESMLSECRDAIQWREELKPDQ 
KQRDYILEGEPGKVSNLQYLHSYLTYIKLSTAIKRNENMAKGLQ
RALLQQQPEDDSKRSPRPQDLIRLYDIILQNLVELLQLPGLEED 
KAFQKEIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 
YANBVNSDAGAFKNSLKDLPDVQELITQVRSEKCSLQAAAILDA 
NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP
GPQPIPCKPLFFDLALNHVAFPPLEDKLEQKTKSGLTGYIKGIF 
GFRS 




5653 


4549 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFEDFPETSEPVWILG 
RKYSIFTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC
MLRCGQMIFAQALVCRHLGRDWRWTQRKRQPDSYFSVLWAFIDR 
KDSYYSIHQIAQMGVGEGKSIGQWYGPNTVAQVLKKLAVFDTWS 
SLAVHIAMDNTWMEEIRRLCRTSVPCAGATAFPADSDRHCNGF
PAGAEVTNRPSPWRPLVLLIPLRLGLTDINEAYVETLKHCFMMP
QSLGVIGGKPNSAHYFIGYVGEELIYLDPHTTQPAVEPTDGCFI
PDESFHCQHPPCRMSIAELDPSIAWRGGHLSTQAFGAECCLGM 
TRKTFGFLRFFFSMLG


6630 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI
SESMKPKF 


6631


2 


423 


LVQCGGIRRRSAWGAWPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI
SESMKPKF 


6632 


1273 


588 


WNSRGRTQRGAAPLAPAAAMKAWQRVTRASVTVGGEQISAIGR
GICVLLGISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQYEILCVSQFTLQCVLKGNKPDFHLAMPTEQAEGFYNSFLEQL
RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 
KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP 


6633


1145 


617 


ATGRHEGVPTLEGI IQQLVNGI ITPATI PSLGPWGVIsHSNPMDY 
AWGANGLDAIITQLLNQFENTGPPPADKEKIQALPTVPVTEEHV
GSGLECPVCKDDYALGERVRQLPCNHLFHDGCIVPWLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGG I PRKGSGPRRR LPMARIiRDCLPRLMLTLRS LLF WS L VYC YC 
GLCAS I HLLKLLWS LGKG P AQTFRRPAREHP PACLS DPS IjGTHC 
YVRIKDSGLRFHYVAAGERGKPLMLLLHGFPEFWYSWRYQLREF 
KSEYRWALDLRGYGETDAPIHRQNYKLDCLITDIKDILDSLGY
SKCVLIGHDWGGMIAWLIAICYPEMVMKLIVINFPHPNVFTEYI
LRHPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHLFTSHSTG
IGRKGCQLTTEDLEAYIYVFSQPGALSGPINHYRNIFSCLPIiKH 
HMVTTPTLLLWGENDAFMEVEMAEVTRFYVKNYFRLTILSEASH
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGLGPHGPSFARVPVAPSSSSG
GRGGAEPRPLPLSYRLLDGEAALPAWFLHGLFGSKTNFNSIAK
ILAQQTGRRVLTVDARNHGDSPHSPDMSYEIMSQDLQDLLPQLG
LVPCVWGHSMGGKTAMLLAIiQRPELVERLI AVD I S PVESTGVS 
HFATYVAAMRAINIADELPRSRARKLADEQLSSVIQDMAVRQHL 
LTNLVEVDGRFVWRVNLDALTQHLDKILAFPQRQESYLGPTLFL 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI
AAIRGFLV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 

OP T VR OPT .O R P P T iCOVT /^PVOOflT »P P ^ T ,f3PVT . ^ PU C FIDHW PT? \TT\ 

DGGDGVF 


6637 


2 


1501 


CSSS P CFHDGT C VLDKAGS YKCACIiAG YTGQRCENLLE AGKS K I 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRHAKIGT 
WSFFCNNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 
LVRRRVLPMQVQSRETPLKQLYSAAPSKQKLQSAPTKKPALPFG 
DLPMGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS
CIPICGKIENITAPKTQGLRWPWQAAIYRRTSGVHDGSLHKGAW
FLVCSGALVNERTVWAAKCVTDLGKVTMIKTADLKVVLGKFYR 
DDDRDEKTIQSLQISAIILHPNYDPILLDADIAILKLLDKARIS
TRVQPICLAASRDLSTSFQESHITVAGWNVLADVRSPGFKNDTL
RSGWSWDSLLCEEQHEDHGIPVSVTDNMFCASWEPTAPSDIC
TAETGGIAAVSFPGRASPEPRWHLMGLVSWSYDKTCSHRLSTAF 
TKVLPFKDWIERNMK 


6638 


1391 


224 


GGIPQAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 
PNS D I DLSNLERLEKY RS FDRYRRRAEQEAQAPHWWRT YRE YFG 
EKTDPKEKIDIGLPPPKVSRTQQLLERKQAIQELRANVEEERAA 
RLRTASVPLDAVRAEWERTCGPYHKQRLAEYYGLYRDLFHGATF 
VPRVPLHVAYAVGEDDLM P VYCX3NEVTPTEAAQ APE VTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL 
PPFPARGSGIHRLAFLLFKQDQPIDFSEDARPSPCYQLAQRTFR 
T FD F YKKHQETMT PAG LS F FQCRWDD S VTY I FHQLLDMRE P VFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1268 


IGC F I MDGGDDGNL 1 1 KKR FVS EAELDERRKRRQEEWEKVRKPE 
D PE E C PEEV YD PRS LYBRLQEQKDRKQQE YE EQ FKFKNM VRGLD 
EDETNFLDEVSRQQELIEKQRREBELKELKEYRNNLKKVGISQE 
NKKEVEKKLTVKPIETKNKFSQAKLIiAGAVKHKSSESGNSVKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGILPGL 
GAYSGSSDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLE P PD VS MAE S EDRS LR I VL VGKTG SGKSATANT I LGE E I FDS 
RIAAQAVTKNCQKASREWQGRDLLWDTPGLFDTKESLDTTCKE 
ISRCIISSCPGPHAIVLVLLLGRYTEEEQKTVALIKAVFGKSAM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTS KAEKE SQVQELVELI EKMVQCNEGAYFSDDI YKDTEER 
LKQREEVLRKI YTDQLNEE I KLVEEDKHKSEEKKEKEI KLLKLK 
YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGLAWSS 
ASAP P PRGFS AI S CTVEGAP ASFGKS FAQKS G Y FLCLS S LG S LE 
NPQENWAD I Q I WDKS PLPLGFS P VCDPMDSKAS VSKKKRMCV 
KLIiPLGATDTAVFDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APRPVPKPRGLSRDMQGLSLDAASQPSKGGLLERTASRLGSRAS 
TLRRNDSIYEASSLYGISAMDGVPFTLHPRFEGKSCSPIAFSAF 
GDLTIKSLADIEEEYNYGFVVEKTAAARLPPSVS 


6642


22 


1296 


PLEERMMTKMDPNDQAQRDIIFELRRIAFDAESDPSNAPGSGTE 
KRKAMYTKDYKMLGFTNHINPAMDFTQTPPGMIALDNMLYLAKV 
HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 
NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN 
KVMQ WREQ I TRALPS KPNS LDQFKS KLRSL S YS E I LRLRQS ER 
MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 
NRRRQERFWYCRLAIiNHKVLHYGDLDDNPQGEVTFESLQEKIPV 
ADI KAI VTGKDCPHMKEKSALKQNKE VLELAFS I LYDPDETLNF 
IAPNKYEYCIWIDGLSALLGKDMSSELTKSDLDTLLSMEMKLRL 
LDLEN IQI PEAPP P I PKE P S S YDFVYHYG 


6643 


3049 


2265 


S LHAPAEGRTRGRLAE KP KMLTRKI KLW D I NAH I TCRL CSG Y LI 
DATT VTECLHTFQ* S CLVKYLEENNT C PTCR I VI HQSH P LQ Y I G 
HDRTMQDIVYKLVPGIiQEAEMRKQREFYHKLGMEVPGDIKGETC 
S AKQHLDSHRNGET KADDS S NKEAAE E KP EEDND YHRS D EQ VS I 
CLECNS S KLRGLKRKW IRCS AQATVLHL KKFI AKKIiNLSS FNEL 
DILCNEEILGKDHTLKFVWTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEMLYILDQR 
LRAQNIPGDKARKVLNDIISTMFNRKFMEELFKPQELYSKKALR 
TVYERLAHAS IMKLNQASMDKLYDLMTMAFKYQVLLCPRPKDVL 
LVTFNHLDT I KGF IRDS PT I LQQVDETLRQLTE I YGGLS AGE FQ 
LIRQTLLIFFQDLHIRVSMFLKDKVQNNNGRFVLPVSGPVPWGT 
EVPGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KIX3TNMYSVNQPVETHVSGSSKNLASWTQESIAPWPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
EVINIQATQDQQRSEELARIMGEFEITEQPRLSTSKGDDLLAMM 
DEL 


6645 


6530 


4646 


FVEGLAGYVYKAASEGKVLTLAALLLNRSESDIRYLLGYVSQQG 
GQRSTPLI IAARNGHAKVVRLLLEH YRVQTQQTGTVRFDGYV ID 
GATALWCAAGAGHFEVVXLLVSHGANVIWTTVTNSTPLRAACFD 
GRLD I VKYL VENNAN I S I ANKYDNTCLM I AAYKGHTD WR YLLE 
QRAD PNAKAHCGATALHFAAE AGH I D I VKEL I KWRAAI WNGHG 
MTPLKVAAES CKADWELLLSHADCDRRS R I EALELLGAS FAND 
RENYD 1 1 KT YH YL YLAMLER FQDGDN I LE KE VLP P I HAYGNRTE 
CRNPQELES I RQDRDALHMEGLI VRER ILGADNIDVSHP I I YRG 
AVYADNMEFEQCIKLWLHALHLRQKGNRNTHKDLLRFAQVFSQM 
IHLNETVKAPD I ECVLRCS VLE IEQSMNRVKNI SDADVHNANDN 
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHIiAVNSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA 
VDNEGNS ALHI I VQYNRP I SDFLTLHS 1 1 1 S LVE AGAHTDMTN K 
QNKT PLD KSTTG VS E I LL KTQM KMS LKCLAARAVRAND I N YQDQ 
I PRTLEEFVGFH 


6646 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL 


6648 


413 


897 


RNCWNCFTKYFNS PPEDIDHKDS YLI TRS IMAEPDYIEDDNPEL 
IRPQKLINPYKTSRNHQDLHRELLMNQKRGLAPQNKPELQKVME 
KR KRDQ VI KQ KE E EAQKKKS DLE I ELL KRQQ KLEQLELE KQ KLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


WI PRAAG I RHE VKWD VKE I MSQHN I YVDALLKE FEQFNRRLNE V 
SKRVR I PLP VSNI LWEHC I RLANRTI VEGYANVKKCSNEGRALM 
Q LD FQQFLM KLE KLTD I RP I PD KE FVETY I KAY YLTENDM ERW I 
KEHREYSTKQLTNLVNVCLGSHINKKARQKLLAAIDDIDRPKR 


6650 


32 


765 


LVPLVFSLLVQS CKQVYRS I AMKFVPCLLLVTLS CLGTLGQAPR 
QKQG S TGEE FHFQTGGRDS CTMRP SS LGQGAGE VWLRVD CRNTD 
QT Y WCE YRGQ P S MCQAFAADP KS YWNQALQE LRRLHHACQGAP V 
LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLISFFRG 


6651


3425 


1353 


AKELLKVGDFSLCAGP YQNTADTMENLS KEPLASF VSESFDI SA 
CGIATEHVKIDNSGEGLTAEAGSETLSRDGEVGVNSDMHYELSG 
DSDLDLLGD CRNPRLDLEDS YTLRGS YTRKKDVPTDGYES S LN F 
HNI^QEDWGCSS WVPGMETSLPPGHWTAAVKKEEKCVP PYVQ I R 
DLHGILRTYANFSITKELKDTMRTSHGLRRHPSFSANCGLPSSW 
TSTWQVADDLTQNTLDLE YLRFAHKLKQTI KNGDS QHSAS S ANV 
FPKES PTQIS IGAFPSTKISEAPFLHPAPRSRS PLLVTWESDP 
RPQGQPRRGYTASSLDSSSSWRERCSHNRDLRNSQRNHTVSFHL 
NKLKYNSTVKESRNDI SL I LNE YAEFNKVMKNSNQF I FQDKELN 
DVSGEATAQEMYLPFPGRSASYEDI I IDVCTNLHVKLRSVVKEA 
CKSTFLFYLVETEDKSFFVRTKNLLRKGGHTEIEPQHFCQAFHR 
ENDTLII I IRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 
LDHT YQE LFRAGGFV I SDD K I LEAVTLVQLKE 1 1 KI LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKS FQSANI I ELLH 
YHQCDSRSSTKAEILKCLLNLQIQHIDARFAVLLTDKPTIPREV 
FENNG I L VTD VNNF I EN IEK I AAP FRS S YW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPLFLPTRPAERAW I RSRRASEWVGKMEVPRLDHALNS PTS PC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 
TCDS FKI S WEMDS KS KDRI TH YFI DLNKKENKNSNKFKHKDVPT 
KLVAKAVPLPMTVRGHWFLS PRTE YTVAVQTASKQVDGD YWS E 
WS E I IE FCTAD YS KVHLTQLL E KAE VIAGRML KFS VF YRNQHKE 
YFDYVREHHGNAMQPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 
TGKPPQDSPYGRYRFEIAAEKLFNPNTNLYFGDFYCMYTAYHYV 
ILVIAPVGSPGDEFCKQRLPQLNSKDNKFLTCTEEDGVLVYHHA 
QDVILEVIYTDPVDLSLGTVAEITGHQLMSLSTANAKKDPSCKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
RVAAAAS RGADD AMES S KPGP VQWL VQKDQH S FELDEKALAS I 
LLQDHI RDLD WWS VAGAFRKGKS FILD FMLRYLYSQKE SGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQS TVKDCAT I FALS TMTS SVQIYNLS QN I QED 
DLQQLQLFTEYGRLAMDEIFQKPFQTLMFLVRDWSFPYEYSYGL 
QGGMAFLDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATSPDFDGKLKDIAGEFKEQLQALIPYVLNPSKLMEKEING 
SKVTCRGLLEYFKAYIKIYQGEDIiPHPKSMLQATAEAYNIiAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFKQLALDHFKKT 
KKMGGKDFSFRYQQELEEEIKELYENFCKHNGSKNVFSTFRTPA 
VLFTGIVALYIASGLTGFIGLEWAQLFNCMVGLLLIALLTWGY 
I RYSGQYRELGGAIDFGAAYVLEQAS SHIGNS TQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTSLS PSQCSS FNLAMAS AGMQ ILGWLTLLGW VNGLVS CALPM 
WKVTAFIGNS I WAQWWEGLWMSCWQSTGQMQCKVYDS LLAL 
PQDLQAARALCVI ALLVALFGLLVYLAGAKCTTCVEE KDS KARL 
VLTSG I VF V I S GVLTL I P VCWTAHAV IRDFYNP LVAEAQ KRELG 
AS LYLGWAASGLLLLGGGLLCCTCPS GGSQGPSHYMAR YS TS AP 
AISRGPSEYPTKNYV 


6655 


341 


16 


KDA YM FKKGLLAIALVFS L P VFAAEHW I DVRVPEQ YQQEHVQGA 
INI PLKE VKE R I ATAV PDKNDTVKVY CNAGRQ SGQAKE I LS EMG 
YTHVENAGGLKD IAMPKVKG 


6656 


2 


1212 


TELPPRPANLAIQPPLSPLRALAPLPEKPGAVPJ'PQKRMAKVAK 
DLNPGVKECMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 
KCSQEDHCLTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 
KRNRM I MTNP XATGKDPTFDTITYEWAP PGVTQKLGLQYMEL I P 
KEKQPVTGTEGAFYRRRQLMHQLPIYDQDPSRCRGLLENELKLM 
EEFVKQ YKSEALGVGEVALPGQGGLP KE EGKQQEKPEGAETTAA 
TTNGSLS DPS KEVEYVCELCKGAAPPDSPWYS DRAG YNKQWHP 
TCF VCAKCS E PLVD L I Y FWKDGAP WCGRH YCES LRPR CS GCDE I 
I FAEDYQRVEDLAWHRKHFVCEGCEQLLSGRAY I VTKGQLLCPT 
CSKSKRS 


6657 


830 


2120 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPEYCEP 
LEHFTGQDL INLTQEDFKKPPLCRVS SDNGQRLLDM I ETLKMEH 
HLEAHKNGHANGHLNIGVDIPTPDGS FS I KIKPNGMPNGYRKEM 
IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISVVHER 
VPPKEVQPPLPDTFFDHFNRVQWAFS I CEINGMILVGLWLI QWL 
LLKYKS I ISRRFFCIVGTLYLYRCITMYVTTLPVPGMHFNCSPK 
LFGDWEAQLRRIMKLIAGGGLSITGSHNMCGDYLYSGHTVMLTL
TYLFI KEYS PRRLWWYHWI CWLLS WG I FC I LLAHDH YTVD VW 
AYYITTRLFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 
NVQG I VPRS YHWPFPW PWHLSRQVKYSRLVNDT 


6658 


35 


855 


HCCALGAPGS PYRGLY FSSAAPCTAPRKAKHQSTLEGLTKRMLM 
FDPVPVKQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 
QTPEGLSHGIQMBPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
S PGLSMPSS S PP I KKYS PPS PGVQPFGVPLS MPPVMAAALSRHG 
IRSPGI LPV I QPVWQP VPFM YTSHLQQPLMVSLS EEMENS S S S 
MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNLHSLPGL
RGLTS ISRNQLQCTNAMRVINNYQRRWKNQNTFLLATFANWNV 
CGN PT I T CPHNRTLNNCHHS G VQ VPLM YCNLTTP S PQN I SNCRY 
AQTPANM FY I VACDl^RDQRRDPPQ YP WPVHLHT 1 1 


6660 


514 


1707 


CAASLDCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNV 
LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 
NG FKDQLCS LVFMALTD PS TQLQLVG I RTLTVLGAQP DLLS YE D 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VP KLAE B LR VGESNLTNGDE P TQC S RHL CCLQALSAVS THP S I V 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSLRQMAEKCQQDP 
ESCW YFHQTAI PCLL7VLAVQASMP EKE PS VLRKVLLEDE VLAAM 
VS VI GTATTHLS P ELAAQS VTH I V P LFLDGNVS FLPENS FPS R F 
QP FQDGS S GQRRL I ALLMAFVCS L PRNVSEH I WEVLL FNLDKVT 
PG 


6661 


179 


430 


GVHAASGTLSATWLAEAKMFDSLAKAGKYLGQAAPCLMIGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


RSLP KPAPAQ PAS I HCARFSGVT P PTAKTAMS DGNTAFNALMYC 
GPKADDGNI FSACAPASSAVKASVSVAQPGQAVIP 


6663 


3 


1005 


RP VLSSRVDDFVPPLPETSGRRKKLERMYS VDRVSDD I P I RTW F 
PKENLFS FQTAS TTMQAI SNFR KHLRMVG S RR VKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVMLIS S KVPKAEYI PTI IRRDDPS 1 1 
P I LYDHEHATFED ILEE IERKLNVYHKGAKI WKMLIFCQGGPGH 
LYLLKNKVATFAKVEKEEDMIHFWKRLS RLMS KVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRLYRLWKSRQHS KLLDFDDVL 


6664 


58 


968 


PRLLRLPRS VWMDS P WDE LALAFS RTSM F P FFDIAHYLVS VMA 
VKRQ PG AAALAWKNP I S S WFTAMLHC FGGG I LS CLLLAE P PLKF 
LANHTN ILLAS S I WY I T FFCPHDLVS QG YS YLP VQLLASGM KEV 
TRTWKI VGGVTHANS YYKNGW I VM I AIG WARGAGGTI I TNFERL 
VKGDWKPEGDEWLECMS YPAKVTLLGS VT FTFQHTQHLAIS KHNL 
MFLYTI FI VATKI TMMTTQTSTMTFAPFEDTLS WMLFGWQQPFS 
SCEKKSEAKSPSNGVGSLASKPVDVASDNVKKKHTKKNE 


6665 


171 


1278 


D ERRLACRQ WTQQR S EL Y PG FQ KRQRFLPKAGEE AAAQGGRHL 
PGRWLG PG CTQNPCS VHTATG PE PRKLPLLP P DS PNSG YP KE PA 
ALCPGI PS PCRMTHQDLS ITAKL INGGVAGLVG VTCVFP I DLAK 
TRLQNQHG KAM YKGM I DCLMKTARAEGFFGM YRGAAVNLTLVTP 
EKAIKLAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVWTC 
ir'MHMJbK.lUijUUACsRIiAVHHQGSASAPSTSRSYT 
ATLIAWELLRTQGLAGLYRGLGATLLRDIPFSIIYFPLFANLNN 
LGFNELAGKAS FAHS FVSGCVAGS I AAVAVTPLDVLKTRI QTLK 
KGLGEDMYSGITDCAR 


6666


498 


2868


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTL 
WSPGCWPQPIQKEGVGLWD I RKPQSSLLRYGGNLSLQSAMSVRF 
NSNGTQLLALRRRLPPVLYDIHSRLPVFQFDNQVYFNSCTMKSC 
CFAGDRDQYILSGSDDFNLYMWR I PADPEAGGI GRWNGAFMVL 
KGHRS I VNQVRFNPHTYM I CS SGVEKI I KI WS P YKQPGCTGDLD 
GRIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
FFDS L VRRE IEG WS S DSDS DL S E ST I LQLHAGVS ERS GYTDS ES 
SASLPRSPPPTVDESADNAFHLGPLRVTTTNTVASTPPTPTCED 
AASRQQRLSALRRYQDKRLLALSNESDSEENVCEVELDTDLFPR 
PRS PS PEDESSS SS SSSS S EDEEELNERRAS TWQRNAMRRRQKT 
TREDKPSAPIKPTNTYIGEDNYDYPQIKVDDLSSSPTSSPERST 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSYISYSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPSSSKEACLNIAMAQRNQDLPP 
EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
S VEHP FETKKLNG KALS S R AE P P P <5 P P VP V a Q a<3 tt .w <za cr-Mf tj 

RTQ SDDSEERSLETI CANHNNGRLHPRP PHPHNNGQNLG ELEVV 
AYSSPGHSDTDRDNSSLTGTLLHKDCCGSEMACETPNAGTREDP 
TDT P ATDS S RAVHGH SGLKROR I E LE DTD PTJ <3 <3 <3 P K"K7 , YT 


6667 


171 


1310 


ABEVERLAAMRS DSL VPGTHTP P IRRRSKFANLGRI PKP WKWRX 
KKSEKFKHTSAALERKISMRQSREELrKRGVLKEIYDKDGELSI 
SNEEDSLENGQSLSSSQLSLPALSEMEPVPMPRDPCSYEVLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQIiQYGSHGQHLPSTTGSL 
PMHPSGCRMIDF!TiNKTriAMTMnT?TiP<5c;pnPUPrQTQVWCCr»T uc 

GDG VTKAG PMGL PEIRQ VP TWI BCDDNKENVPHESD YE DS S CL 
YTREEEEEEEDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKNILPRQTDEERLELRQQIGTKL 


6668 


714


358 


iwnvniur «xj J- UA.\_n v ^» isooWLlUlo V V v-Jtr/^IaorCir i. 1 I\J J. VDr 

LRGNLVKKDCAE S CTPS YTLQGQVS SG TS STQ CCQEDL CNE KLH 
NAAP TRTALAH SAL S LGLAL S LLAV I LAPSL 




6669


459


1207 


KDEETRKDYDYMLDHPEBYYSHYYHYYSRRLAPKVDVRWILVS 
VCAISVFQFFSWWNSYNKAISYLATVPKYRIQATEIAKQQGLLK 
KAKEKGKNKKS KEE IRDEEEN I IKNI I KSKIDI KGGYQKPQ ICD 
LLLFOIILAPFHLOQYTVWYCRWTYTJPNTtfnVPVnPPTTUT VTTD 

KSMKMS KS QFDS LEDHQKETFLKRE LW I KENYEVYKQEQE EEL K 
KKLAND P R WKR YRRWMKNEG PGRLTF VDD 


6670 


184 


594 


VARI*GEAAKMSSEPPPPYPGGPTAPLLEEKSGAPPTPGRSSPA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM
PPGFYPP PGPHP PMGYYP PGP YTPGP YPG PGGHTATVLVPSGAA 
TTVTV 


6671 


1 


763 


LPAEKPRSAPNMAGGRCXSPQLTAlxLAAWIAAVAATAGPEEAALP 
PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPSCQQTDSEWEAF 
AKNGE I LQI SVGKVDVIQE PGLSGRFFVTTLPAFFHAKDG I FRR 
YRGPG I FEDLQNY I LE KKWQS VE PLTG WKS PAS LTMSGMAGLF S 
I SGKI WHLHNYFTVTLGI PAWCS YVFFVI ATLVFGLSMDLVL* V 
ISQCNWDPPYRHVS * /RPSTNLGVHTAHTSEHLRL 


6672 


304 


1089 


APGSKP VQFMDFEGKTS FGMS VFNLSNAI MGSGILGLAYAMAHT 
GVI FFLALLLCI ALLS S YS IHLLLTCAG I AG I RAYEQLGQRAFG 
PAGKVWATVICLHNVGAMS S YLF I IKSELPLVIGTFL YMDPEG 
DWFLKGNLL 1 1 1 VS VL 1 1 LPLALMKHLG YLGYTSGLS LTCMLFF 
LVS VI YKKFQLGLCYRATMKQQWES EALVGTPQPRDS TAAVKAQ 
MFHS *LTG VLTQWP I MAFAF VCHPGGAG PS ITELCRAFQAQD 


6673 


1116 


1963 


LQIQTIiHTHHGARVTHLGSHQLLANAGTMLCRQQSSSMAPAFSQ 
S VTCGPS PCVRKQES ATKCLH IGACGSDLWARGWEQG* G* GLNV 
WLC P CVAFHRGAR PQAEEGGARWNS LVS SPWI PPNP * HSS IGAE 
NAVPRP*QG*KVNPSGQERQS\WVLPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTQHPRFSDTGW 
FGAGHCHSS CDFTRKGAAGGPG 


6674 


1 


440 


LE FD YMCQ YDYVE VRDGDNRDGQI I KR VCGNER PAP IQS IGSS L 
HVL FHSDG S KNFDGFHAI YEE I TAC S SS PCFHDGTCVLDKAGS Y 
KCACLAGYTGQRCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTRVAFFLT 






WO 01/53312 



PCT/US00/34263 





6675 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTVVTMASARVQDLIGLICWQ
VTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAKGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6676 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTVVTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6677 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTVVTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRXGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6678


221 


865 


GPSNQSSGSLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*LR 
P FP CS QLPMS QGCLWHLDCCCPWVP Y I PGQQWRKGRQRMRN * QS 
LLGSDQESVGLEDLCVFVNFLLHVLLGLFP*PHELFLLPWDLG 
FLFPLLLQGG CHCL VLPANL VSQAPQ I GKLS CRLQTHDLEGS RN 
HHPLFL WGRWDAVKHIiE T VQS GLAS LG FVGQHTS HGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRS PGQNWVKTVDGWKRFIiDEKSGSFVSDL 
SS YCNKE VYNKENLFNS LN YD / S CSQE E KEGHAE * QNQNS \D FH 
QEKW I YVHKGSTKERHGYCTLGEAFNRLDFSTAI LDSRRFNYW 
RLLELIAKSQLTSLSGIAQKNFMNIliEKVVLKVLEDQQNITLIR 
ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETILHWQQQLN 

NTOTT'RV t ?nOAriPPPnQrtQT.WPnTrtOTPrin'R r PPTPVTPPQnT 1 P 
v* i^J. i is. v ouynyrrrbooLJiinKlJ i\J\J 1 rcyur n>r j,rvl CtdO\jUc 


6680


1498 


2951 


PLCTLPLMPSALPGWAGERWEKQWPLA/ PGPGTWQTPVGS ISE3 
P\RKNEPDTHCPRGE ARPEV* HLP KPHS PGSEGAE IQTSA*ALP 
/NQVSPPQPM*GAEENGDQRGGKEEAGEELHRSSSGLTAAPGFP 
EVHRNLQTFPGLPSRGGGP/GGAGTQGSWAPGEQPP/SPLLPAS 
MQRSQAGLPG WEAGLVES PTHHI PALRPSGTNATGEAFPSTTCS 
SGP\PAPPGPTGLRPGGGSSSGGHG**PGLPVGKV\GALGAAQD 
PQSQGRGPTQGTVGTEMLLSGLGSAKACPAARPAVP*LPSDPAS 
TIPKKGTRGFGEGPGVLQERNRWWGRAQGFTSADAAGTAPPGV 
* LPAPLSQPPGATEPQVRACGMAPPS PGTSGRLVAWGRHPGPQV 
AQGCPPGAGCWGSQPRGSQRCPRTYTHSPLGHGRAPCPRRCWH* 
WQDP P S S PRTGCL PG I PARQ AYS APRTRS RPG I RTGRAAYG FIR 
FQGGGGG 


6681 


1169 


511 


INYIYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSERE 
KMTVGVLTQTVGP WSRPGAYLS KQLDGVS KGWPPCPRALAATAL 
LAQ3ADELTLRQNLNRKSPHA\WTLINTKGHH*LINARLTRYQ 
TLLCENPHKT IEVSNT/ LNPATLLLVTES PVKHNCLEVLDS VYS 
S RPNLRDHP * TS VDWELYVDGSGFANPCKVTLKKETS PAPVTPR 
S 


6682 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE


6683 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVXDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE


6684 


111 


527


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA 
PP\SGRRHA*RPA*WLGGPGGDSGGREEGGS / GELQRAMESKMG 
ELPLDINIQEPRWDQSTFLGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685 


258 


1473 


KLLGDNFEGFCNKFELSDS ENGSNS *QSPL\ FDRLFDPDPQKVL 
QGVI DMKNAV I GNNKQKANL I VLGAVPRLLYLLQQETS S TE LKT 
ECA WLGSLAMGT ENNVKS LLD CH 1 1 PALLQGLLS PDLK F I EAC 
LRCLRT I FTS P VT PE ELL YTDAT V I PHLMALLS R SR YTQE Y I CQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQALKCFS 
VLAFENPQVSMTLWVLVDGEUJPQIFVKMLQRDKPIEMQLTSA 
KCLT YMCRAGAI RTDDNC I VLKTL P CLVRMCS KE RLLEER VEGA 
ETLAYL I EPD VELQR IAS I TDHLIAMLAD YFKY P S S VS AI TD I K 
RLDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGEGRPPVL 
TASRQGVTST 


6686 


310 


927 


DS VTFDD LAVD FTP KEWTLLD PTQRNL YRDVML ENY KNLATVG Y 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTS SGIQMIGSHNGGE VSDVKQCGDVSSEHS CLKTHVRTQN 
S ENTFE C YLYGVD FLTLHKKTS TGEQRS VFS HVWKKPS S LNPD V 
VCQKNRCTRKKKAF * LQLTLGKSFH+ SIHT 


6687 


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTTSSTSNSGNETSGSST 
I GETSNRS RDRDR YRRRNS RS RSPGRQ CRHRSRS WDRRHGS E S R 
SRDHRREDRVHYRS PPLATGE P VDNLS PEERDARTVFCMQLAAR 
IRPRDLEDFFSAVGKVRDVRIISDRNSRRSKGIAYVEFCEIQSV 
PLAIGLTGQRLLGVPIXVQASQAEKNRLAAMANNIjQKGNGGPMR 
LYVGS LHFN I TE DMLRG I FE P FGKV 


6688 


1025 


1 


AEVPWYPRVFHKCPDSCWRFKFQPIQLQPYILLSFSSEKPPISF 
SEPGLPR/SATARMATAAAPPNSSIDLPSDSGMGFISPAGDSLD 
LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVICGSKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 
TMSELEELFSLFSPAPLLSKLFTSSGS IAICCQDSGPSDTGRLS 
VCQLWLADSDTGKLSDCQEWTVGDSGGLTCPELSLGRM*MSLL 
S SAVI PG YS S SSDSRLNTVPTVDLLCP FQTKS ST 


6689 


640 


1299 


S SSAS YATS ATS I SDTAFSGS LKLKHGLLSALDSS SRTS * STS S 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQIAMSL* ATKF*RNACNPNCLSSKKSAL* LS LNQRF 
GGSAS RKPGNI S FNS QKCSALS YCCNFVI KPREVS VSS ENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGQTFEYLKREHSLSKPYQGVGTGSSSLWNLMGNAMVMTQ 
YIRLTPDMQS KQGAL WNR VPCF LRDWELQVHFK I HGQG KKNL \ H 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQEQKKMIQAQESITLEDVAV 
D FTWE EWQLLGAAQKD L YRDVM LENYSNL VAVG YQAS KP DALFK 
LEQGEQLWT I E DGIHSGAC SD I W KVDHVLE RLQS ESL VNRRKPC 
HEHDAFEN I VHCS KS QFLLGQNHD I FDLRG KS LKSNLTLVNQS K 
GYE I KNSVEFTGNGDS FLHANHERLHTAI KFPASQKLISTKSQF 
I S PKHQKTRKLEKHHVCS ECGKAF I KKS WLTDHQ VMHTGEKPHR 
CSLCEFCAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/ GKGFIQKTCLIAHQRFHTER 


6692 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK


6693 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW
NAAAGKFQVDTGAKFYRGMSLEYYGIEADDNPFFDLSVYFLP


6695 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW
NAAAGKFQVDTGAKFYRGMSLEYYGIEADDNPFFDLSVYFLP


6696


1 


782 


PRVRGRVGERWAFLSVPAAMSSEMEPLLLAWSYFRRRKFQLCAD 
LCTQML E KS P YDQ AAWI LKARALTEM VY I DE I D VDQEG IAEMML 
DENAI AQVPRPGTSLKLPGTNQTGGPS QAVRP ITQAGRP I TGFL 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 

t* c? r>r"i^ n o t ktt otst kit Tvvpftvnw Tm/TntTTtmtTMTT/m* •» »■» 
i bFLKj.fr IKJjbivb^lJ-tX ill bUKj^KJ^AKAXiIEYIFHHENnVKTAT.n 

LAALSTEHSQYKDWWWK/DQIEKCYYRVGMYREAEKQIKSS 




6697


3


782 


PPLFLRRLNSRALRPGSRKVMAWPASLSGQDVGS FAYLTI KDR 
IPQILTKVIDTLHRHKSEFFEKHGEEGVEAEKKAISLLSKLRNE 
LQTDKP F I P LVE KF VDTD I WNQ YLE YQQS LLNE S DG KS RWF YS P 
WLLV\ ECYMYRRIHEAI \ I QS P PIDYFDVFKES KEQNF YGSQES 
1 1 ALCTHLQQL I RT I EDLD \ ENQLKDE FF KLLQ I S LWGE I S VDL 
SL\SGGESSSQNTNVLNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 


668 


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP/PARRVLPRAMTASAQPRGRRPGVGVGVWTSCKHPRCV 
LLGKRKGS VGAGS FQLPGGHLE FGETWE ECAQRETWEEAALHLK 
NVHFASWNSFIEKENYHYVTILMKGEVDVTHDSEPKNVEPEKN 
ESKRIIYNHAFFFQESKWSGGILQ 


6700 


1098 


1392 


TQCWRS STPGMRTHFRTQP / RLECGQGFSQQENGHCMDTNECIQ 
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYEPNEDGTACVERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAGPRTRVRRAAAFEGQPSPSPGLGPTSDKAAAPRTPKRRRLW 
RQRQ /HPAMLCYVTRPDAVLMEVE VEAKANGEDCLNQVCRRLGI 
I EVD Y FGLQFTGS KGES LWLNLRNR I S QQMDGLAP YRLKLRVKF 
FVEPHLILQEQTRHIFFLHIKEALLAGHLLCSPEQAVELSALLA 
QTKFGD YNQNTAKYNYEELCAKELS S ATLNS I VAKHKE LEGTS Q 
AS AE YQVLQ I VS AM ENYG I E WHS VRDS EGQKLL I G VGPEG I S I C 
KDDFSPINRIAYPWQMATQSGKNVYLTVTKESGNSIVLLFKMI 
STRAASGL YRA I TETHAF YRCDTVTS AVMMQYS R DLKGHLAS LF 
IiNENINLGKKYVFDI KRTSKEVYIDHARRALYNAGVVDLVSRNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVLQEKLRKLKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAQLQVGESAAHFCLQPHLSL 
LLTGSRSQVLAR 


6702 


397 


1971 


PLAKFLKLDLVNVLCLPMEDVFLFYRTCFCSMGLGSSCHLSLPK 
RAEALLCSRKATWRDLVAVRMAEEQEFTQLCKLPAQPSHPHCV 
NNTYRSAQHSQALLRGLLALRDSGILFDWLWEGRHIEAHRIL 
LAASCDYFKGMFAGGLKEMEQEEVLIHGVSYNAMCQILHFIYTS 
ELiELSLSNVQETLVAACQLQIPEIIHFCCDFLMSWVDEENILDV 
YRLAELFDLSRLTEQLDTYILKNFVAFSRTDKYRQLPLEKVYSL 
LS SNRLEVS CETEVYEGALL YH YSLEQVQADQI SLHE PP KLLET 
VRFPLMEAEVLQRLHDKLDPSPLRDTVASALMYHRNESLQPSLQ 
SPQTELRSDFQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTASLAPRMS NQG I AVLNNFVYL I GGDNNVQGFRAE S RCWR YD P 
RHNRWFQ I QS LQQEHADL S VC WGR Y I YAVAGRD YHNDLNAVER 
YD P ATNS WAYVAPLKRE V YAHAGATLEGKMY ITCGR KGR I T 


6703 


45 


1244 


G VG PRAAAM P LE LELCPGRW VGGQHPC F 1 1 AE I GQNHQGDLD VA 
KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 
YGEHKRHLEFSHDQ YRELQRYAEE VG I FFTASGMDEMAVEFLHE 
LNVPFFKVGSGDTNNFPYLEKTAK/TRGWHSVLRDVCGVQLNDE 
TS S WD VLGR VRTS KE KVLMVLVLD YSGRPMVI SS GMQ SMDTM KQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 
IGYSGHETGIAISVAAVALGAKVLERHITLDKTWKGSDHSASIjE 
PGELAELVRS VRLVERALGS PTKQLLP CEMACNE KLG KS WAKV 
KIPEGTILTMDMLTVKVGEPKGYPPEDIFNLVGKKVLVTVEEDD 
TIMSE 


6704 


82 


1007 


TMNTRNRWNSGLGASPASRPTRDPQDPSGRQGELSPVEDQREG 
LEAAPKGP SRES WHAGQRRTSAYTL IAPN INRRNB I QR IAEQE 
LANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQLQLMQSK 
YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQYKTAE FL /RQTEHR IARQKCLS KCCLW PT I LN 
MGQKLGLQ\DSLKAEENRKLQKMKI)EQHQKSELLELKRQQQEQE 
RAKIHQTEHRRVNNAFLDRLQGKSQPGGLEQSGGCWNMNSGNSW 
GI 


6705 


2 


786 


RLCRNSARVPCGWSASRSLGEGAGFIGPLRGPHPRAGGTGTSFT 
S YKRKGGIMSTIAAFYGGKS IL ITVATGFIiGKELMEKLFRTSPD 
LKVI YI LVRP KAGQTLQHRVFQ I LDS KLFEKVI E VRPNVHEKIR 
AI YADLNQNDFAISKEDMQELLSCTNI IFHCAATVRFDDTLRHA 
VQLimrATRQLLLl^SQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PCPVEPKKIIDSLEW\LDDAIIDEITPKLIRDWPNIYTYTK 


6706 


130 


531 


FTHSSSSHSQEMLGKLNMLRNDGHFCDITIRVQDKIFRAHKVVL 
AACS D F FRT KL VGQAEDENKNVLDIjHHVTVTG F I PLL E YAYTAT 
LS INTENI IDVLAAASYMQMFSVASTCSEFMKSSILWNTPNSQP 
EK 


6707 


2233 


1343 


YWSGIGYELQHFHWRKFHFEKKGPPSTCQERLYESRSRWPCIS* 
GMVWGWTAVNGSW*GGQLRCVCVCTSHSSDSTRSSQRASKCHS 
FF I LSQ * KT * S S WENWVFAKYS R I YS YGHS CS KGRGD * D FK*NV 
SOAR * SR FCGLCNP CGHCGLD I NLRGGS S PWTD KHS CVHNNLLC 
NRRVFSLLCEG PGHCYQGAVCREACAAAS PGLD S AAE PHRLCEH 
TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYA 
C+RCHWYFEWLLYNHCGDILVACL+RRQL*SSQ 


6708 


115


1729 


TVGSWSRSGRSPPVGRQLLLTGRGAQAAGSPQGGMALQVELVPT 
GE I IRVVHPHRPCKLALGSIX3VRVTMESALTARDRVGVQDFVLL 
ENFTSEAAFIENLRRRFRENLIYTYIGPVLVSVNPYRDLQIYSR 
QHMERYRGVS F YEEP PHLLAVADT VYRALRTERRDQAVM I S VES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNS S R FGKYMD VQ FD FKGAP VGGHI LS YLLEKS RWHQ 
NHGERNFHIFYQLLEGGEEETLRRLGLERNPQSYLYLVKGQCAK 

TfC O T\mvCTMJinn7DVTl T Tn7TriT?TT?T\'C\7TmT T CTIVJVOT7T UT nMTU 

VSS INLJKSDWKV VKKAJjI V lUt J. huh VhUuLtO ±AAbvbnlA*JNi.H 

FAANEESNAQVTTENQLKYLTRLLSVEGSTLREALTHRKIIAKG 
EELLSPLNLEO^YARDALAKAVYSRTFTWIiVGKINRSLASKDV 
ES P S WRSTTVLGLLE I YGFE VFQHNS FEQ FC I N YCNE KLQQ LF I 
ELTLKSEQEEYEAEGIAWEPVQYFNNKI ICDLVEEKFKGI I \S I 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVS KRSRKEEEDLEAL IAHFQTLDAKRTQTVELPCP P 
PSPRLNASLSVHPEKDELILFK3GEYFNGQKTFLYNELYVYNIRK 
DTWTKVD I PS P P PRRCAHQAVWPQGGGQLWVFGGEFAS PNGEQ 

DVtlWIM IiHrr UT hTI/TMt?^MrVO'Pr , <^Ticr , t3 0r , tIDNnfT\MT/T3AT TT C 

r Yn i KULWVJjxiXiA. I K. 1 W h \2 V Jxb 1 vjvj F b bKbvHKIWAWKj<U.ulIjfc 
GGFHESTRDYIYYNDVYAFNLDTFTWSKLSPSGTGPTPRSGCQ\
IPSLPRAASSVYGGYSKQRVKKDVDKGTRHSDMF


6710


158


980 


RHKMTNYRVESSSGRAARKMRIiALMGPAFIAAIGYIDPGNFATN 

T/"\7\ /"'7\ PtVUAT T T»TT n 7"l TTd 71 MT K/TTV MT TATT OTAI/TYIT 11V3ITMT 71 TT»/"\ Y 

IQAGAbrXjYQJuJjWV VVWiy^JjM/«nJj 

RDHYPRPWWFYWVQAEIIAMATDLAEFIGAAIGFKLILGVSLL
QGAVLTGIATFLILMLQRRGQKPLEKVIGGLLLFVAAAYIVELI
FSQPNLAQLGKGMVIPSLPTSEAVFLAAGVL\GATIMPHVI/YI
WHSSLTQHLHGGSRQQRYSATKWDVAIAMTIAGFVNLAIMATAA
SELNFYGHTGVA 


6711 


3 


347 


VTECKTMTCKMSQLERNI*TMINTLHHYSVKLGHPDTLIHGEFK 
ELVRTDLHN I LM KENKNDQAI * H I MEDLDTNAHMQ 1 1 FKEL IML 
MAMLTWSYHDNMHDADYGPGQQHRPG 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQIAMALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRIiPPGENIDDWIAVHV 
VDFFNRINLIYGTMAERCS*TSCPVMAGGPRYEYRWQDERQYRR 
PAKLSAPRYMALLMDWIESLI


6713


2485 


3 


QARGSDSEDGEFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 

ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF
AALHENPDIIIATPGRLVHVAVEMSLKLQSVEYWFDEADRLFE
MGFAEQLQEIIARLPGGHQTVLFSATLPKLLVBFARAGLTEPVL 
IRLDVDTKLNEQLKTSFFLVREDTKAAVLLHLLHNWRPQDQTV
VFVATKHHAEYLTELLTTQRVSCAHIYSALDPTARKINLAKFTL
GKCSTLI VTDIiAARGLDIPLLDNVINYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQSWDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSRPAPSPESIKRAKEMDLVGLGLHPLFSSRFEEEELQRLRL 
VDSIKNYRSRATIFEINASSRDLCSQVMRAKRQKDRKAIARFQQ
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVEDIFS 
EWGRKRQRSGPNRGAKRRREEARQRDQEFYTPYRPKDFDSERG 
LSISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRQQLKWDRKKKR 
FVGQSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 
GRRRGILTRRRPRTEEVGEARPLAQAGCIPGPHAPRHPLQAESA
LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLPPPPAPMAHIPSGGAPAAGAAPMGPQYCVCKVELSVS
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
SKKFVLDYHFEEVQKLKFALFDQDKSSMRLDEHDFLGQFSCSLG
TIVSSKKITRPLLLLNDKPAGKGLITIAAQELSDNRVITLSLAG 
RRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW
KPFTVPLVSLCDGDMEKPIQVMCYDYDNDGGHDFIGEFQTSVSQ 
M CE ARDS VPL E FEC INP KKQ RKKKNYKNSG 1 1 ILRSCKINRDYS 
FLDYILGGCQLMFTVGIDFTASNGNPLDPSSLHYINPMGTNEYL
SAIWAVGQIIQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


GPAGAESGSLHCLPATVQALAGAAHSPHGGQPPRRGPLIGSGMP
GKPKHLGVPNGRMVLAVSDGELSSTTGPC3GQGEGRGSSLSIHSL 
PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS 
AENVTFWKACERFQQIPASDT


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ
HTVTLHRVSLCCSK


6717 


115 


896 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDYGGSGGPYSKQYAG 
YDYSQQGRFVPPDMMQPQQPYTGQIYQPTQAYTPASPQPFYGNN
FEDEPPLLEELGINFDHIWQKTLTVLHPLKVADGSIMNETDLAG 
PMVFCLAFGATLLLAGKIQFGYVYGISAIGCLGMFCLLNLMSMT 
GVSFGCVASVLGYCLLPMILLSSFAVIFSLQGMVGIILTAGIIG
WCSFSASKIFISALAMEGQQLLVAYPCALLYGVFALISVF 


6718 


290 


599 


KQSSTVPGTILPSLKWHNSGLCKFPETGGKMTTFKEGLTFKDVA 
VIFTEEEIX3LLDPVQRNLYQDVMLENFRNLLSVGHHPFKHDVFL 
LEKEKKLDIMKTATQ 


6719 


1 


691 


PTRPEEQDREDGKCHKMEMNPISGNLNCDPIAMSQCSSDHGCET
DLDS DDDK IE KPNNFMKDS AS QDNGLS RKI SRKR VCSSDS DSSh 
QWKKSSKARTGLLRITRRCAATAANKIKLMSDVEDVSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP
DPDSEGSTKVLSQAIiNGDSDSEDMLNSEHKHRHTNIHKIDAPSK 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF
LMSVSFNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF
LMSVSFNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM
SWTLISE 


6722 


1 


390 


RSWSKRTWQALPMAVLFLLLFLCGTPQAADNMQAIYVALGEAVE 
LPCPSPSTLHGDEHLSWFCSPAAGSFTTLVAQVQVGRPAPDPGK 
PGRESRLRLLGNYSLWLEGSKEEDAGRYWCAVLGQIfflNYQNW 


6723 


173 


659 


VCQYCTARMADFGISAGQFVAWWDKSSPVEALKGLVDKLQALT
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL
VEVKELQREPLTPEEVQSVREHLGHESDNL


6724 


173 


659 


VCQYCTARMADFGISAGQFVAWWDKSSPVEALKGLVDKLQALT
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL
VEVKELQREPLTPEEVQSVREHLGHESDNL


6725 


356 


722 


RRRTP P VI LATMDDDLMLALRLQE E WNLQ EAE RDHAQE S IjSLVD 
ASWELVDPTPDLQALFVQFNDQFFWGQLEAVEVKWSVRMTLCAG 
ICSYEGKGGMCSIRLSEPLLKLRPRKDLVEVFFV 


6726 


98 


714 


HLQKMERKINRREKEKEYEGKHNSLEDTDQGKNCKSTLMTLNVG 
GYLYITQKQTLTKYPDTFLEGIVNGKILCPFDADGHYFIDRDGL 
LFRHVLNFLRNGELLLPEGFRENQLLAQEAEFFQLKGLAEEVKS
RWEKEQLTPRETTFLEITDNHDRSQGLRIFCNAPDFISKIKSRI
VLVSKSRLDGFPEEFSISSNIIQFKYFIK 


6727 


1 


831 


FRGMGDERPHYYGKHGTPQKYDPTFKGPIYNRGCTDIICCVFLL
LAIVGYVAVGIIAWTHGDPRKVIYPTDSRGEFCGQKGTKNENKP
YLFYFNIVKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS
RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLIPSKPLARRCF
PAIHAYKGVLMVGNETTYEDGHGSRKNITDLVEGAKKANGVLEA
RQLAMRIFEDYTVSWYWDIISLGIAMAMSLLFIILLRFLAGIMG
RGMIIMGILVLGY 


6728 


486 


935 


FCSSWLRSLADSSLSWKMFLVGLTGGIASGKSSVIQVFQQLGCA 
VIDVDVMARHWQPGYPAHRRIVEVFGTEVLLENGDINRKVLGD
LIFNQPDRRQLLNAITHPEIRKEMMKETFKYFLREPRTSPRGKK
HVPSALKEADSLMRRDT


6729 


259 


1191 


VGLTGAQSGRTASMGRDQRAVAGPALRRWLLLGTVTVGFLAQSV
LAGVKKFDVPCGGRDCSGGCQCYPBKGGRGQPGPVGPQGYNGPP 
GLQGFPGLQGRKGDKGERGAPGVTGPKGDVGARGVSGFPGADGI 
PGHPGQGGPRGRPGYDGCNGTQGDSGPQGPPGSEGFTGPPGPQG 
PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 
GPVGAPGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPN
G I PS DTLH P 1 I AP TG VT FHPDQ YKGE KG S EGE PG I RGI S L KGEE 
GIM 


6730 


784 


1015 


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEAYEVLSNDEKRDIYDKYGTEGLNEF 


6731 


1 


446 


GIRKRLHGAWPRVEVGCPWETRESEGVHLERPTSPLKNNDEGS 
LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEEGTAKEATY
NDLQVEYGKCQLQMKELMKKFKEIQTQNFSLINENQSLKKNISA 
LIKTARVEINRKDEEI


6732 


102 


1205 


GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN 
LGSIKPSSRSTKATSTTMAGDGRRAEAVREGWGVYVTPRAPIRE
GRGRLAPQNGGSSDAPAYRTPPSRQGRREVRFSDEPPEVYGDFE
PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 
PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 
DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 
YEATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 


6733 


613 


1311 


RSCRQVGMRSRNQGGESASDGHISCPKPSIIGNAGEKSLSEDAK 
KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLI 
QLLSIMEGELQAREDVIHMLKTEKTKPEVLEAHYGSAEPEKVLR 
VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQL
LLAEKCHRRTVYELENEKHKHTDYMNKSDDFTNLLBQERERLKK 
LLEQEKAYQARKE 


6734 


189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAVHFTWEEWQDLDD 
AQRTL YRD VMLET YS S L VSLGH CI TKPEM I FKLEQGAE P WI VBE 
TLNLRLSGGSKKQVFSGICHRSLVELQEVHLV


6735 


280 


558 


KSRRAGVTKMSNPFLKQVFNKDKTFRPKRKFEPGTQRFELHKKA 
QASLNAGLDLRLAVQLPPGEDLNDWVAVHWDFFNRVNLIYGTI 
XDGCT 


6736 


195 


808 


MNYELNFKREMPNIKSLGLTNLNFLLKRLSSVLPLITDYVYFEN 
SSSNPYLIRRIEELNKTASGNVEAKWCFYRRRDISNTLIMLAD
KHAKEIEEESETTVEADLTDKQKHQLKHRELFLSRQYESLPATH 
IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
EIRVGPRYQADIPEMLLEGTFFCVFAVL 


6737 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSRLESYRPDTDLS 
REDTGCNLQHISDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 
REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAIYYHIKNRDPDGRMLLDIFDENLHPLSKSEVPPDYDKHN
PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP
ANWKRIVLGAILLASKVWDDQAWNVDYCQILKDITVEDMNELE
RQFLELItfFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAHKLEAISRLCEDKYKDLRRSARKRSASADNLTLPRWSPAIIS


6738 


148 


653


CACAEQPARAEVGAATALPVRWASGSMAPSGSLAVPLAVLVLLL
WGAPWTHGRRSNVRVITDENWRELLEGDWMIEFYAPWCPACQNL
QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRCALLAAQ 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF
VLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ
TAEELNASTLMNYCAEIIASHWVSEVEGVNKAL


6740 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRCALLAAQ 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF
VLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ
TAEELNASTLMNYCAEIIASHWVSEVEGVNKAL


6741 


141 


960


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM
YDLNSNNPNPIISYDGVNKNIASVGFHEDGRWMYTGGEDCTARI
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL
PLAIGILQEGEFESLARRGLLFLACQGNCYVWNLTGGIGDEVTQ
LIPKTKIP 


6742 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM
YDLNSNNPNPIISYDGVNKNIASVGFHEDGRWMYTGGEDCTARI
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL
PLAIGILQEGEFESLARRGLLFLACQGNCYVWNLTGGIGDEVTQ
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGDPNPSAAPTSTCAPRKMPKRISISKQLASVK 
ALRKCSDLEKAIATTALIFRNSSDSDGKLEKAIAKDLLQTQFRN
FAEGQETKPKYREILSELDEHTENKLDFEDFMILLLSITVMSDL
LQNIR


6744 


95 


1343 


RTPARNRCAGCEVLSRFSSPNKASSFALQSAGGGLPAVRALRRD 
RQKVSTVGYGMDEVEQDQHEARLKELFDSFDTTGTGSLGQEELT 
DLCHMLSLEEVAPVLQQTLLQDNLLGRVHFDQFKEALILILSRT 
LSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTVIEPLDEEARPSHIPAGDCSEHWKTQRSEEYEAEGQLRFWNP
DDLNASQSGSSPPQDWIEEKLQEVCEDLGITRDGHLNRKKLVSI 
CEQYGLQNVDGEMLEEVFHNLDPDGTMSVEDFFYGLFKNGKSLT
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERILDTWQEEGIENSQEILKALDFGLDGNINLTEL 
TLALENELLVTKNSIHQACI


6745 


1 


568 


TFRDQGWAQRRRWLLGCASWESWEAAIAAGPGLPSSTARQQNNP
AAGTECFAAVWARGTAMGSVLSTDSGKSAPASATARALERRRDP
ELPVTSFDCAVCLEVLHQPVRTRCGHVFCRSCIATSLKNNKWTC
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI
RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI
SLWTWAAIQAVEKKMESQAARLQSLEGRTGTAEKKLADCEKMA
VEFGNQLEGKWAVLGTLLQEYGLLQRRLENVENLLRNRN


6747 


247 


484 


EAVTFKDVAWFTEEELGLLDLAQRKLYRDVMLENFRNLLSVGH 
QPFHRDTFHFLREEKFWMMDIATQREGNSVYAGVC


6748 


201 


665 


MTTFKEAVTFKDVAVVFTEEELGLLDPAQRKLYRDVMLENFRNL 
LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFFKEGDVPC 
QIEARLSISXVQQXPYRCNECKQ


6749 


95 


719 


RREVKGGDGVCPRARGSPQSQQFPSCAGGGEGLQQSGEALDGAM
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF
VDVDLLLGEIDPDQADITYBGRQKMrSLSSCFAQLCHiCAQSVSQ 
INHKLEAQLVDLKSELTETQAEKVVLEKEVHDQIiLQLHS IQLQL 
HAKTGQSADSGTIKAKLSGPSVEELERELKAN 


6750 


3 


428 


SCESRRPGAKWVWASGALPRDTTGLGSEQPSGDVAQSNRATMGT 
TAPGPIHLLELCDQKLMEFLCNMDNKDLVWLEEIQEEAERMFTR 
EFSKEPELMPKTPSQKNRRKKRRISYVQDENRDPIRRRLSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCIIQMSGSTTTIV
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML
QHAFEGYNVCIFAYGQTGAGKSYTMMGKQEKDQQGIIPQLCEDL
FSRINDTTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVRE
HPLLGPYVEDLSKLAVTSYNDIQDLMDSGNKARTVAATNMNETS
SRSHAVFNIIFTQKRHDAETNITTEKVSKISLVDLAGSERADST
GAKGTRLKEGANINKSLTTLGKVISALAEMDSGPNKNKKKKKTD
FIPYRDSVLTWLLRENLGGNSRTAMVAALSPADINYDETLSTLR
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDLLYAQGLGD 
ITDMTNALVGMSPSSSLSALSSRNV 


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSDIKMQYSHHCEHLLERLNKQREAGFL 
CDCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 
VKADGFQKLLEFIYTGTLNLDSWNVKEIHQAADYLECVEEVVTKC 
KIKMEDFAFIANPSSTEISSITGNIELNQQTCLLTLRDYNNREK 
SEVSTDLIQANPKQGALAKKSSQTKKKKKAFNSPKTGQNKTVQY
PSDILENASVELFLDANKLPTPWEQVAQINDNSELELTSWEKT 
i r fwyuivnl v i vxsJ<iu<^ivoyVJNLJUjJUiHSMSNIASVKSPYEAE 
NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPYV 
CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF 
HSRMHHGEEKPYKCDVCNLQFATSSNLKIHARKHSGEKPYVCDR 
CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAFAVSSSLITHSR
KHTGEKPFICELCGNSYTDIKKLKKHKTKVHSGADKTLDSSAED 
HTLSEQDSIQKSPLSETMDVKPSDMTLPLALPLGTEDHHMLLPV
TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY


6753 


2 


1305 


VPSLPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS 
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHS3TGDSADAGPPAAG 
SARGEKEMEGVALKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV
AHPGPPPASSQTPAPEHDKAAKKMPLAQKPALAPKPTSQTPPAS 
PLSKLSRPYLVELLSRRAGRPDPEPSEPSKEDQESSDRRPPSPP
GPEERKGQKRDEEEEATERKPASPPIiPATQQEKPSQTPEAGRKE 
KPMLQSRHSIiDGSKLTEKVETAQPLWITLALQKQKGFREQQATR 
EERKQAREAKQAEKLSKENVSVSVQPGSSSVSRAGSLHKSTALP
EEKRPETAVSRLERREQLKKANTLPTSVTVEISYSSPAAPLVKE
VSKRFSSPDDAPVSSEPAWLALAKRKAKAWSDCPLIIK


6754 


2 


413 


FVRRRRRRLGGPEVNTMSSLHKSRIADFQDVLKEPSIALEKLRE 
LSFSGIPCEGGLRCLCWKIIiLNYLPLERASWTSILAKQRELYAQ 
FLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVLL 


6755 


298 


1343 


PGLQLQVALEADWFLDMPGGRRGPSRQQLSRSALPSLQTLVGGG 
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LLFEFLFFIYLLVALFIQYINIYKTVWWYPYNHPASCTSLNFHL
IDYHLAAFITVMLARRLVWALISEATKAGAASMIHYMVLISARL
VLLTLCGWVLCWTLVNLFRSHSVLNLLFLGYPFGVYVPLCCFHQ
DSRAHLLLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL
KEQFNNATPIPTHSCPLSPDLIRNEVECLKADFNHRIKEVLFNS
LFSAYYVAFLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 


6756 


180 


754 


IERALGSLPLSIPVSWGSLRTLKYQQQPLRPKVLLCQTRVQCHD
LRSLQPQPPGLKQSFCLRVLGLQTGATTPGLRDLTCKELIILTE
REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ
KALYWDVMLENYRNLVFLGKDNFALEVKICPRVFLYFLCCLSWE
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLATVCMLLFSHLSAVQ 
TRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFIKQGRKLDID
FGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQ
AANQGEFQKPDNKLHQQVLW


6758 


1 


1008 


ASGPELPGRRFRDRAPWLPARLLRGVLAVWVSLSALGPGSFCRR
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS
LPPSFRRNMANNSPALTGNSQPQHQAAAAAAQQQQQCGGGGATK
PAVSGKQGNVLPLWGNEKTMNLNPMILTNILSSPYFKVQLYELK
TYHEWDEIYFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGGI
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT
QPPTDLWDWFESFLDDEEDLDVKAGGGCVMTIGEMLRSFLTKLE
WFSTLFPRIPVPVQKNIDQQIKTRPRKI


6759 


1 


513 


RKHNFHSLDGTSTRAFHPQTGLPLLSSPVPQRKTQSGCFDLDSS
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG
NFEESVLNYRFDPLGIVDGFTAEVGASGAFCPTHLTLPVEVSFY
SVSDDNAPSPYMGVITLESLGKRGYRVPPSGTIQWCVL


6760 


239 


606 


VLSKKKGLSAEEKRTRMMEIFSETKDVFQLKDLEKIAPKEKGIT
AMSVKEVLQSLVDDGMVDCERIGTSNYYWAFPSKALHARKHKLE
VLESQLSEGSQKHASLQKSIEKAKIGRCETEERT


6761 


29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS
SSSSVQRCELSLFQSLHTMTSKKLVNSVAGCADDALAGLVACNP
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG
MLTGVIAGAVFTSPAVGSILAAIRAVAQAGTVGTLLIVKNYTGD
RLNFGLAREQARAEGIPVEMVVIGDDSAFTVLKKAGRRGLCGTV
LIHKVAGALAEAGVGLEEIAKQVNWTKAMGTLGVSLSSCSVPG
SKPTFELSADEVELGLGIHGEAGVRRIKMATADEIVKLMLDHMT
NTTNASHVP VQ PGSS WMMVNNLGGLS FLE LG 1 1 ADATVR SLEG 
RGVKIARALVGTFMSALEMPGISLTLLLVDEPLLKLIDAETTAA 
AWPNVAAVSITGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL
ERVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG
PPPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKT
SliPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQBL 


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWLSLFI 

LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LWVNHYLAFQFFAEE Y YP FS EVLAYFTFCLW 1 1 PFAFFVSLSA 
GBNVLPSTMQPGDDWSNYFTKGKRGK 


6763


2 


760 


SGPDFPGRRFRGCCCVkPPAGAGMELGGHWDMNSAPRLVSETAE 

ID t/TM? /^V t/'TT" T , T7'RT« , fiAT'iCflAT T(~l A D T? T . T f^T VT/ 1 ni?T nT.CP^70MTn7n 

KKyky&ItjiliAJliiADiOAVUAKK^ JjUJjrvjVisMWP 

LLSLHVKSLGASPTVAGIVGSSYGILQLFSSTLVGCWSDWGRR
SSLLACILLSALGYLLLGAATNVFLFVLARVPAGIFKHTLSISR 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYLTELEDGF 
YLTAFICFLVFILNAGLVWFFPRREAKPGSTE


6764


80 


438 


LKKMDTMMLSVRNLFEQLVRRVEILSEGNEVQFIQLAKDFEDFR
KKWQRTDHELGKYKDLI^KAETERSALDVKLKHARNQVDVEIKR 
RQRAEADCEKLERQIQLIREMLMCDTSGSIQ


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL 
RGSVLVSEALSGSAMDGIVTEVAVGVKRGSDELLSGSVLSSPNS 
NMSSMWTANGNDSKKFKGEDKMDGAPSRVLHIRKLPGEVTETE
VIALGLPFGKVTNILMLKGKNQAFLELATEEAAITNGNYYSAVT 
PHLRNQ 


6766 


1 


1287 


EGGSFKASLTWLWPLGEMKLHCEVEVISRHLPALGLRNRGKGVR 
AVLSLCQQTSRSQPPVRAFLJjISTLKDKRGTRYELRENIEQFFT 
KFVDEGKATVRLKEPPVDICLSKANSSSLKGFLSAMRLAHRGCN
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHL
QTSYCGLVRVDMRMLCLKSLRKLDLSHNHIKKLPATIGDLIHLQ
ELNLNDNHLES Fb VALCHSTLiQKSLWS LDLb KNKIKAJjPVQFCQ 
Jj^B bKNLiKijlJDW U.Lj I Q F PC K.X GQLi lWLRr ho AARNKiiPFLPSEr 
RNLSLEYLDLFGNTFEQPKVLPVIKLQAPLTLLESSARTILHNR 
I P YGSH 1 1 PFHLCQDLDTAKI CV CGRFCLNS F IQGTTTMNLHS V 

AHTWT A/TYMT /3f3 TP & P T T PPQ T .OP VUN^ 9DT 
rUll V V U V UmIAjVJ ICi/Ur JL loir wJJU\.J| ViNOoXJ J. 


6767 


336 


919 


APMICLCSSDLQFRYKEAFLRDRGLQIGYCSVDDDPRMKHFLNV 
GRLQSDNEYKKDFAKSRSQFHSSTDQPGLLQAKRSQQLASDVHY

PPGSYKVEMARRAAEIiANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATEILHVKKKKALLL


6768


2 


363 


PGSTISCYLLSEGSLPLCMQVACGEEKHRAPTMKTLRARFKKTE 
LRLSPTDLGSCPPCGPCPIPKPAARGRRQSQDWGKSDERLLQAV
ENNDAPRVAALIARKGLVPTKLDPEGKSAFHL 


6769 


284 


396 


MSTPDFSTAENNQELANEVSCLKAMLTLMLQAMGQAD


6770 


1 


397 


QRl^QVIWSSTMAKLHDYYKDEWKKLMTEFNYNSVMQVPRVEK 
ITLNMGVGEAIADKKLLDNAAADLAAISGQKPLITKARKSVAGF
KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAKS 


6771 


3 


378 


APAGTLAMTGKSVKDVDRYQAVLANLLLEEDNKFCADCQSKGPR
WAS WNIGVFTC I RCAGIHHNT /3VH I SRVKS VNLDOWTOROI OOM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772 


1 


1400 


AAAFLQGMTVNGFINTVITSL\ERRYDLHSYQSGLIASSYDIAA
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG
P**GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQLVFMLGQFL
HGVGATPLYTLGVTYLDENVKSSCSPIYIAIFYTAAILGPAAGY
LIGGALLNIYTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFFT
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT
IRDLPLSIWLLLKNPTFILLCLAGATEATLITGMSTFSPKFLES
QFSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVIK
FCLFCTWSLLGILVFSLHCPSVPMAGVTASYGGSLLPEGHLNL
tapcnaacs cqpehyspvcx3sdgldtyfslchagcpaatetnvdg 
QKVYRDCSCIPQNLSSGFGHATAGKCTST


6773 


1 


630 


P WEAPKEHKYKAE EHTVVLT VTG3 PCH FP FQ YHRQL YHKCTHKG 
RPGPQPWCATTPNFDQDQRWGYCIiEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT
EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVBGHRL 
CHCPVGYTGPFCDVGE*GSGASRRPAPRWDGLAR


6774 


146 


389 


LTELSDQQYFLFFILSS/WVPTFLSMDVDGRVIKADSFSKIISS 
GLRIGFLTGPKPLIERVILHIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


TPDQOTiT5\J r T. r riiDf2/ , 71?Dl\DCDnT MTT \7T »T CUD T T BluTXTC'' 

l UfoULin V Jj 1 AKIjVj KKAfb Fy Jj W 1 Li V LJ\b 1 hit K.WRo HR I LoRMNS 

GRPETMENLPALYTIFQGEVAMVTDYGAFIKIPGCRKQGLVHRT
HMSSCRVDKPSEIVDVGDKVWKLIGREMKNDRIKVSLSMKVVN 
QGTGKDLDPNNV\SLSKKRGGGDPSRITLGRRSPLRLS


6776 


3 


1108 


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLLH 
LNGTFPNTSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
I o v>ii\r v c rtfvjlrir'l Vwl LiUtxnJjbDKr(jK.K.p VIjRWL Y JjQ V AIVGT 
CAAIiAPTFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAFAIRDWHILQLWSVPYFVIFLTS
SWLLES ARWL 1 1 NNKPE EGLKELRKAAHRSGMKNARDTLTLEI L 

YFGLNLHG/LKHLGNNVFLLQTLFGAV/TPPGQLVLHLGHWGSG
RVSSRGRVNCLGLFVLQVW


6777


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPLYACGSY 
GRSLGLYAWDDGSPLALLGGHQGGITHLCFHPDGNRFFSGARKD
AELLCWDLROS G YP LWS LGPRVTTNOR T YFHT ,n DTfifiPT ,VQ n C T 
SGAVSVWDTDGPGNDGKPEPVLSFLPQKDCTNGVSLHPSLPLLG
HCLPVSVCFLSPTESGGRRRGAGPSLGSPRRHVHLECRLQLWWC
GGGARLQHP**SPRARKGR


6778 


311 


805 


IOSITDESRfi^TRPKNPA'NITRT.PT 1 'NVP\ PRTAr:n<;Tr /TTPCJDCT??? 

VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779 


2 


535 


RALRRQPRLLAANGIEPESMAISEPIKGSRKPCVNKEELALKKP
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV
AWVSAKNPAPMRKKKKVSLGPVSYVLVDSEDGRKKPVMPKKGPG
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK
V 


6780 


3 


403 


HEVNDNKPEININLMSPGKEEISYIFEGDPIDTFVALVRVQDKD 

LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRYEFVISE 
K 


6781 


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT
SAALPTHLQSALMSTWTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPWINSSSIIQVMK<3SQPSTIPAAPLTTNSGLMPPSVAVVGPL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTLPSSQ
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC 
KKVTGS LEKG EE Q YGADGE TEGQGIjDTTAPGLMGT EQLS TELD S 
KTPTPPAPTLLKMTSSPVGPGTASAGPSLPGGALPTSVRSIVTT 
LVPSELISAVPTTKSNHGGIASESLAG


6782 


3 


1327 


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYMLSVPHGIANEDIVSQ 
NPGELSCKRGDVLVMLKQTENNYLECQKGEDTGRVHLSQMKIjIT 
PLDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV 
DDLNLTSGEIVYLLEKIDTDWYRGNCRNQIGIFPANYVKVIIDI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELSFSEGEIII 
LKEYVNEEWARGEVRGRTGIFPLNFVEPVEDYPTSGANVLSTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


SYHHHHAQQSAAASPNLTASQKTVTTTSMITTKTLPLVLKAATA
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVT
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL
PTSQNSIHPVRWNGQTATIAKTFPMAQLTSIVIATPGTRLAGP
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT
ANCNQGEETK 


6784 


3 


1750 


SYHHHHAQQSAAASPNLTASQKTVTTTSMITTKTLPLVLKAATA
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVT
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL
PTSQNSIHPVRWNGQTATIAKTFPMAQLTSIVIATPGTRLAGP
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT
ANCNQGEETK 


6785 


1 


528 


LGNTVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVLAPTRELANHVSRDFKDI\TRKLTVARFYGGTSYQSQ
INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQML
DLGFAEQVEDIIHESYKTDSEDNPQTLLFSATCPQWVYTVA\KK
YMKSRYEQVDLDGKMTQKAATTVEHLAIQCHWSQRPAVIGDVLQ
VYSGSEGRAIIFCETKKNVTEMAMNPHIKQNAQCLHGDIAQSQR
EITLKGFREGSFKVLVATNVAARGLDIPEVDLVIQSSPPQDVES
YIHRSGRTGRAGRTGICICFYQPRERGQLRYVEQKAGITFKRVG
VPSTMDLVKSKSMDAIRSLASVSYAAVDFFRPSAQRLIEEKGAV
DALAAALAHISGASSFEPRSLITSDKGFVTMTLESLEEIQDVSC
AWKELNRKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS
FD*VFYHLVDFLSDFLVDSVYLTGRQIDHLTGLTGLIDHLTSHS
SVWN 


6787 


2646 


2270


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFF 
FELGS PSGVI SAHCNLRLLGS SDS PAPAS RVAG 1 1 GTCHHAWL I 
LVFLVEMGFHHVGQAGLKLLTL\VIHPPWPPKVLGLQT 


6788 


16 


936 


GGWDLR\DMLAVSVLAAVRGGR/ATVRRVRESNVLHEKSKGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD\ELDQ 
BVILWGS * DS *GYPKGK* LLPKE VPSR/RVLLSGLTPLDATQE \ 
FTEDLS k\ YVTTMVCVAVNG KPMLG VI hkp fs E YTAWAMVDGGS 
NVKARSS YKfEKTPR I WSRSHSGMVKQVALQTFGNQTTI I PAGG 
AGYKVIjALIjDVPDKSQEKADLYIHVTYIKKWDI CAGNAILKALG 
GHMTTLSGEEISYTGSDGIEGGLLASIRMNHQALVRKLPDLEKT
GHK 


6789 


2 


678 


GNGINVLKIAPESAIKFMAYEQIKRLVW**PGDS*GF/YERIiYA 
GSLAGAIAQSSIYPMEVLKTRMALRKTGQYSGMLDCARRILARE 
GVAAFYKGYVPNMLGIIPYAGIDLAVYETLKNAWLQHYAVNSAD
PGVFVLLACGTMSSTCGQLASYPLALVRTRMQAQASIEGAPEVT
MSSLFKHILRTEGAFGLYRGLAPNFMKVIPAVSISYWYENLKI
TLGVQSR 


6790 


2 


4068 


APPAGRRRMQAAPRAGCGAALLLWIVSSCLCRAWTAPSTSQKCD
EPLVSGLPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW
KP YHQDGN I WAFPGN I NSDG WRHELQHP 1 1 AR YVR I VPLDWNG 
EGRIGLRIEVYGCSYWADVINFDGHWLPYRFRNKKMXTLKDVI 
ALNFKTSESEGVILHGEGQQGDYITLBLKKAKLVLSLNLGSNQL 
GPIYGHTSVMTGSLLDDHHWHSVVIERQGRSINLTLDRSMQHFR
TNGEFDYLDLDYEITFGGIPFSGKPSSSSRKNFKGCMESINYNG
VNITDLARRKKLEPSNVGNLSFSCVEPYTVPVFFNATSYLEVPG
RLNQDLFS VS FQFRTWNPNGLLVFSHFADNIiGNVE IDLTES KVG 
VHINITQTKMSQIDISSGSGLNDGQWHEVRFLAKENFAILTIDG
DEASAVRTNSPLQVKTGEKYFFGGFLNQMNNSSHSVLQPSFQGC
MQLIQVDDQLVNLYEVAQRKPGSFANVSIDMCAIIDRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATCHNSIYEPSCEAYKHLGQT 
SNYYWIDPDGSGPLGPIiKVYCNMTEDKVWTIVSHDLQMQTPWG 
YljlPEKYSVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 
NTPDGSPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
Y YCNCDAD YKQWRKDAGFLS YKDHLPVS QVWGDTDRQGSEAKI* 
SVGPLRCQGDRNYWNAASFPNPSSYLHFSTFQGETSADISFYFK
TLTPWGVFLENMGKEDFIKLELKSATEVSFSFDVGNGPVEIWR
SPTPLNDDQWHRVTAERNVKQASLQVDRLPQQIRKAPTEGHTRL 
ELYSQLFVGGAGGQQGFLGCIRSLRMNGVTLDLEERAKVTSGFI 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDIiAQ 
EEIRFSFSTTKAPCILLYISSFTTDFLAVLVKPTGSLQIRYNLG 
GTREPYNIDVDHRNMANGQPHSVNITRHEKTIFLKLDHYPSVSY 
HLPSSSDTLFNSPKSLFLGKVIETGKIDQEIHKYNTPGFTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCGASPLTLSPM
SSATDPWHLDHLDSASADFPYNPGQGQAIRJMGVNRNSAI IGGVI 
A\WIFTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYLAMG 


6791 


1801 


1193 


TGHEGAKGEKGDKGDLGPRGERGQHGPKGEKGYPGIPPEL/PGW 
SAW*SWLTAASTKVQAILLPQPLE*LGLQIAFMASLATHFSNQ
NSG 1 1 FSS VETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
EEVYVYLMHNGMVFSMYSYEMKGKSDTSSNHAVLKLAKGDEVW 
LRMGNGALHGDHQRFSTFAGFLLFETK


6792 


33 


1073 


VRHTNWGVDMYLFSLGSES PKGAI GH I VS TB KT I LAVERNKVLL 
PPLWNRTFSWGFDDFSCCLGSYGSDKVLMTFENLAAWGRCLCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRLRQALYGHTQAV 
TCLAASVTFSLLVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 
ITISDVSGTIVSCAGAHLSLWNVNGQPLASITTAWGPEGAITCC
CLMEGPAWDTSQIIITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
AVSRNHTKLLVGDERGRIFCWSADG*EERGSRGSGTTVPG


6793 


2340 


805 


GRKEANY\YGSLTQAGTVSLGLDAEGQEVFVPFSAVLPMVAPND
LVFDGWDISSLNLAEAMRRAKVLDWGLQEQLWPHMEALRPRPSV
YIPEFIAANQSARADNLIPGSRAQQLEQIRRDIRDFRSSAGLDK
VIVLWTANTERFCEVIPGLNDTAENLLRTIELGLEVSPSTLFAV
ASILEGCAFLNGSPQNTLVPGALEIiAWQHRVFVGGDDFKSGQTK 
VKSVLVDFLIGSGLKTMSIVSYNHLGNNDGENLSAPLQFRSKEV
S KS NVVDDM VQ SNPVL YTP GEEP DHCVVI KYVP YVGDS KRAIaDE 
YTSELMLGGTNTLVLHNTCEDSLLAAPIMLDLALLTELCQRVSF
CTDMDPEPQTFHPVLSLLSFLFKAPLVPPGSPWNALFRQRSCI
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRLFLPAAP
HDPTLKAPTNKGRCHFSPPSTWGSWGL


6794 


169 


1349 


DDVKRKPEASAH*EKPGPPSRPGVRGGRERAGGRGSHGARSCR\
EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGEKTYTQRCRLFVG
NLPTDITEEDFKRLFERYGEPSEVFlNRDRGFGFIRLESRTIiAE 
IAKAELDGTILKSRPLRIRFATHGAALTVKNLSPWSNELLEQA
FSQFGPVEKAWWDDRGRATGKGFVEFAAKPPARKALERCGDG 
AFLLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP
RFAQPGTFEFEYASRWKALDEMEKQQREQVDRNIREAKEKLEAE
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR 


6795 


1740 


1010 


GPRRQTQVRDHELDSF*DWAAQETDCAQNSGERIi*KGV/LENFS 
TMSKSAVKISLDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNQ
EKVNQIQKTVIEPLKKFGSVFPSLNMAVKRREQALQDYRRLQAK
VEKYEEKEKTGPVLAKLHQAREELRPVREDFEAKNRQLLEEMPR
FYGSRLDYFQPSFESLIRAQWYYSEMHKIFGDLSHQLDQPGHS
DEQRERENEAKLSELRALSIVADD


6796 


48 


683 


GKEIQIPTIKLAWLLFGLE*PVGALGKGWSF**SHVALGQLGW
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSLNPPET
SVQEGRDCWQR*LPRLFSALVGQPGCWPQGAPPERCV*PGRCKW
HLQSQVLR*ERRRCCRCLPRFA*GWRRRHQRLGLGIHPAPLGST
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC*


6797


1620 


211 


TERMTPSQPTRGSSCTRFSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTLIRWWLLTASRVDPP 
ERPPPPPSDDLTLLESSSSYKNL/DAQIPQ/DWSMSPSTSG*RP 
LTSRASS IMRSRTAIPSAS *SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVS S STETTASGS CLTWWS SSPAPCPSS SAPAHS FEAS CCK 
TSLWGS CGGSGDGSSAOGSGWNLSMAGTS CS SPAMCS PSRAPS * 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT* PSSTTTI SSS 
PHCGWPCPASCAS AAAWLSSTWATAS VAGSCWGP IM * S SAHSP W 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCGSSPSSTFTPSS 
ASSSTWCSASSSRSS PAPTTPSS I PAAQAQRRAS CRPTSHSART 
APPPAS S AAGAAR PAAFSAAAEGTPRRS I RC W 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGS PRLS ALECVLLVPQ\ PQ I A 
VRDLAHK I QS PQE WEALQALT YLGDR VS EKVKTKV I ELLYS WTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG
QVINGEVATLTLPDSEGNSQCSNQGTLIDLASLDTTNSLSSVLA
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
S WLDEE LLCLGLAD PAPNVP P KES AGNS Q WHLLQREQS DLD FFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL 
D QLLEEAKVTSGLVKPTTS PL I PTTTPAR PLL P FS TGPGS PLFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPS GTELS P FS P I Q PPAA I TQ VMLLANP LKE KVRLR YK 
LTFALGEQLS TEVGE VDQFPPVEQWGNL 


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRL LAHKI Q S PQE WE ALQALT Y LGDRVS E KVKTKV I E LL YS WTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV
T KRLHTLEE VNNNVRLLSEMLLH YS QEDS S DGDRELMKE LF DQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
AP SAG S S L FS TGVAPALAPKVE PAVPGHHGLALGNS ALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLS FQS QGS P PKG PELS LAS I HVPLES I KPSS ALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPLKE KVRLR YK 
LTFALGEQLS TEVGE VDQFP PVE QWGNL 


6800 


404 


1646 


RRS PSTGLS P VPQPS S PS LSDYS I PWS LLLSGTIAWATPGK* AG 
* PQAW* LGLAPAIAFI /GLTRGRKQNKEKMAEGGSGDVDDAGDC 
SGAR YNDWSDDDDDSNESKS I VWYPPWAR IGTEAGTRARARARA 
RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 
AALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKAL 
IVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLL 
TNMTVTNEYQHMLANS I SDFFRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVIFENINDN 
FKWEEJNEPTQNQFGEGSLFFFLKE FQVCADKVLG I ESHHDFLVK 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESQQASVTMHDVDAESFEVLVDYCYTGRVSLSEANVERL 
YAASDMLQLEYVREACAS FLARRLDLTNCTA I LKFADAFGHRKL 
RSQAQSYIAQNFKQLSHMGSIREETLADLTLAQLLAVLRLDSLD 
VESEQTVCHVAVQWLEAAPKERGPSAAEVFKCVRWMHFTEEDQD 
YL EGLLTKP I VKK YCLD V IEGALQMRYGDLL YKSLVP VPNS S S S 
/R* QQQLS C I CSRKS T PETG YVCQGDGDLLWT PQRS LS \ RYDPY 
SGD I YTMPS PLTSFAHTKTVTS SAVCVSPDHD I YLAAQPRKDLW 
VYKPAQNS WQQLADRL LCREGMD VAYLNG YI Y ILGGRDP I TG VK 
LKEVE C YS VQRNQWALVAP VPHS FYS FEL I WQNYL YAVNS KRM 
LC YDP SHNMWLNCAS L KRS DFQ EACVFNDE I YCI CD I P VMKVYN 
PARGEWRRISNIPLDSETHNYQIVNHDQKLLLITSTTPQWKKNR 
VTVYEYDTREDQW INI GTMLGLLQFDSGF I CLCARVYPS CLE PG 
QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802 


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE 
PSTRKNLMNSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE
RKVAEL KTKLDAAER FLS TREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREE KE KERLNE ELHELKE ENKLLKGKNTLANKE KEHYE C 
EIKRLNKALQDALNIKCSFSEDCLRKSRVEFCHEEMRTEMEVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQ3NETSQSQLNRLNSQ 
IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCLAPPPVCCQAG/PR 
TPGL K* S S CLWL PKC * NFR FI LS KE S PSVE VHTNRERQQATRER 
G 


6803 


1 


2203 


KLSGRPYRHMGVLGTSKLYDIRKTIFTFTPQFIDQQQFYLALDN 
KMIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTSLNS 
S I LAALRKMQ DG YFGGARVQTGKLS E FLTTS CCTHL S FMDPGP E 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHP KLAPTS QKGGLDRFQAAVQTT CDLMSLVT KAKELHVQ 
NVHMYLPTKLFQASRPSFNLLDSPHPRQENQVPSVRVEIHLPRD 
Q SGE VD FKALVLQLKETS SLQE QADI L YMLYTMKG PDWNTEL YN 
ERSATVRELLTELYGKVGEIRHWGLIRYISGILRKKVEALDEAC
TDLLSHQKHLTVGLPPEPREKTISAPLPYEALTQLIDEASEGDM 
SISILTQEIMVYLAMYMRTQPGLFAEMFRLRIGLIIQVMATELA 
HSLRCSAEEATEGLMNLSPSAMKNLLHHILSGKEFGVERK/SVR 
PTDSNVSPAI S IHE IGAVGATKTERTGIMQLKSEIKQVEFRRLS 
ISAESQSPGTSMTPSSGSFPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVP VGF YQKVWKVLQ KCHGL S VEG FVLP S STTREMT PGE I K 
FSVHVES\VLNVLLRPEYRQLLVEAILVLTMLADIEIHSIGSII 
AVEKI VHI ANDLFLQEQKTLGP \DDTMLAKDPASG\ I CTLR\ YD 
SAPSGRFGTMTYLS \RAA\ATYVQEFLP \HS I CAMQ 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 
KDWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQT VAE EE S CS PS VE L E KP PP VNVDS KP I EE KTVE VNDR KAE FP 
SSGSNFSA*IPLPYLHLNRLHQSL*QKGSRQQSSVTVSEPLAPN 
QEE VRS I KS ETDST I EVDS VAGELQDLQS ERE * LASRF * CQCEL 
KQ* * SARTRTS *KSLYRSEKSERCSGRRKFI KKAEKKP* SNSGK 
QQKEGKRHK 


6805 


1539 


206 


RQPDLKYFGKSFDVSVSESSSLLSNDLPKFADGIKARNRNQNYL 
VPS P VLRI LDHTAFSTEKSAD I VI CDEE CDS PES VNQQTQEESP 
I E VHTAED V P IAVE VHAI S ED YD I E TENNS S ES LQDQTDEE P PA 
KLCKIIoDKSQAIiNVTAQQKWPLLRANSSGLYKCELCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI 
CKYCD YKTV I FENLS QH I ADTH FS DHL Y W CEQCD VQFS S S S ELY 
LHFQEHSCDEQ YLCQ FCEHETND P EDLHSHWNEHACKL IELSD 
KYNWGEHGQYSLLSKITFDKCKNFFVCQVCGFRSRLHTNVNRHV 
A I EHTKI FPHVCDDCGKGFS SMLE \ I AKHLNSHLSEG I YLCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERELISHLP 
VHETT 


6806 


272 


3794 


VALCFPNSDPVMFMDAFYGCLLAELGPVPIEVPLTRKDAGSQQV 
GFLLGS CGVFLALTTDACQKGLPKAQTGE VAAFKG WPPL S WLVI 
DGKHLAKPPKDWHPLAQDTGTGTAY I EYKTS KEGSTVGVTVSHA 
SLLAQCRALTQACGYSEAETLTNVLDFKRDAGLWHGVLTSVMNR 
MHWS VP YALMKANPLS W I Q KVC F YKARAAL VKS RDMH WS LLAQ 
RGQRDVSLSSLRMLIVADGANPWSISSCDAFLNVFQSRGLRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLS VLTVQDVGQVMPGANVCWKLEGTP YLCKTDE VGE I CV 
SSSATGTAYYGLLGITKNVFEAVPVTTGGAPIFDRPFTRTGLLG 
F IG PDHL VF I VG KLDG LM VTGVRRHNADDWATALAVE PMKFVY 
RGR I AVFS VTVLHDDR I VLVAEQR P DAS E EDS FQ WMSRVLQAI D 
S IHQVGVYCLALVPANTLPKAPLGGIHI SETKQRFLEGTLHPCN 
VLMCPHTCVTNLP KPRQKQP E VG P ASM I VGNLVAG KR IAQASGR 
ELAHLEDSDQARKFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
STATCVQLHKRAERVAAALMEKGRLS VGDHVALVY PPGVDL IAA 
FYGCLYCGCVPVTVRPPHPQNLGTTLPTVKMIVEVSKSACVLTT 
QAVTRLLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP
DVLAYLDFSVSTTGILAGVKMSHAATSALCRSIKLQCELYPSRQ 
IAICLDPYCGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVS Q YKARVT FCC YS VMEMCT KGLGAQTGVLRM KG VNLS C VRTC 
MVVAEERP\R1ALTQSFSKLFKDLGLPARAVSTTFGCRVNVAIC 
LQGTAGPDPTTVYVDMRALRHDRVRLVERGSPHSLPLMESGKIL 
PGVKVI IAHTETKGPLGDSHLGE I WVSS PHNATGYYTVYGEEAL 
HADHFSARLS FGDTQT IWARTGYLGFLRRTELTDAS GGRHDALY 
WGSLDETLELRGMRYHPIDIETSVIRAHRS IAECAVFTWTNLL 
WWELDGLEQDALDL VALVTNWLEEHYLWGWVI VDPGVI P 
INSRGEKQRMHLRDGFLADQLDPIYVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSPPVNVTVSPRSEESHTTTVSGGNG 
SVFQAGPQLQALANLEARRGSIGAALSSRDVSGLPVYAQSGEPR
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGFCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 
RFPCGMEVHSGQRELESWAVGEAMA\LKFPMGAMSYCLRDRSR 
FLFRLPMGLSCPLQVQ 


6808 


2063 


737 


GVGSGAASALARSRPLASRLSSRRRTRAPRSGAMQRLAMDLRML
SRELSLYLEHQVRVGFFGSGVGLSLILGFSVAYAFYYLSSIAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPF\ 
ITS KPPVQ YRNELIKTADGGQ I SLDWFDNDNS TCYMDASTRPTI 
LLLPGLTGTSKESYILHMIHLSEELGYRCWFNNRGVAGENLLT 
PRTYCCANTEDLETVIHHVHSLYPSAPFLAAGVSMGGMLLLNYL
GKIGSKTPLMAAATFSVGWNTFACSESLEKPLNWLLFNYYLTTC 
LQSSVNKHRHMFVKQVDMDHVMKAKSIREFDKRFTSVMFGYQTI 
DDYYTDAS PS PRLKSVG I PVLCLNS VDDVFS PSHAI P I ETAKQN 
PNVALVLTS YGGH I GFLEG I W PRQST YMDRVFKQ FVQAMVEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
GQFGKILDVEI I FNERGSKGFGFVTFETSSDADRAREKLNGTIV 
EGRK I E VNNATAR VMTNKKTGNP YTNGWKLNP WGAVYGP E FYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPP I PTYG 
AWYQDGFYGAE I \LEATQPTDTLS PLQRRQPTAT VTAES TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6810


939 


65


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
GQFGKILDVEI I FNERGS KGFGFVTFETSSDADRAREKLNGTI V 
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAP PPPPI PTYG 
AWYQDGFYGAE I \ LEATQ PTDTLS P LQRRQPTATVTAE S TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1522 


658 


DLVT VWS FVDCR VI ASTHGH \ KSWVS WAFDP YTTSVEEGDPME 
FSGS DED FQDLLHFGRDRADS TQCRLS RRNS TDS R P VSVT YRFG 
SVGQDTQLCL WDLTED I L FPHQPLSRARTHTNVMNATS P P AGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLS LHDR KERHHE KDHKRNHSMGHI S S KS S DKLNLVTKTK 
TDPAKTLGTPLCPRMEDVPLLEPLICKKIAHERLTVLIFLEDCI 
VTACQEGF I CTWGRPGKWSFNP 


6812 


4001 


1*82 


EDAVFSLDLSTIIQGTWFLNGEELKSNEPEGQVEPGALRYRIEQ 
KGLQHRL I LHAVKHQDSGAL VGFS C PGVQDSAALT I QES P VH I L 
SPQDKVSLTFTTSERWLTCELSRVDFPATWYKDGQBCVEESELL 
WKMDGRKHRL I L PEAKVQD SGE FE CRTEGVS AFFGVTVQDP PV 
HI VDPREHVF VHA I TS EC VMLACE V\DR \ EDAP VR W YKDGQE VE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKVYVAAVRLERWLTCELCRPWAEVRWTKDGE 
EWESPALLLQKEDTVRRLVLPAVQLEDSGEYLCEIDDESASFT 
VTVTEPPVRIIYPRDEVTLIAVTLECWLMCELSREDAPVRWYK 
DGLEVEES EALVLERDGPRCRLVLPAAQP EDGGE FVCDAGDDSA 
FFTVTVTEPPVQFLALETTPSPLCVAPGEPWLSCELSRAGAPV 
VW S HNGRP VQEG EGLELHABG PRR VLCIQAAG PAHAGL YT CQSG 
AAPGAPSLS FTVQVAEPPVRWAPEAAQTRVRSTPGGDLELVVH 
LSGPGGPVRWYKDGERLASQGRVQLEQAGARQVLRVQGARSGDA 
GEYLCDAPQDSRIFLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEVS P PDADVTWLRNGAVVTPGPQRQSCCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL 
LHS RQGS Q I DQT E CV I RMNDAPTRG YGRDVGNRTSLRV I AHS S I 
QRILRNRHDLLNVSQGTVFI FWGPSS YMRRDGKGQVYNNLHLLS 
QVLPRLKAFM I TRHKM LQ FDELF KQE TGQ\NRK I SNTWLS TGWF 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ARERNRMHGLNDALDNLRKWPCYSKTQKLS KIET 
LR1AKNYI WALS E I LRIG KR PDLLTFVQNLCKGLS QPTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRS P YSTFYP P YHS PELTTP PGHG 
TLDNSKSMKPYNYCSAYESFYESTS PECAS PQFEGPLS PPPINY 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPLGQGAMFRLPTD 
SH F P YDLHLRS Q S LTMQDE LNAV FHN 


6815 


906 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRSSEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG


6816

1 


803 


NLLKTHKF\LLGQDEDSLHSVPVAQMGNYQHYLKTLASPLREID 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSSKRRRSMSLLLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LI KPTLVHTDAT I IHDGHEEKMENGQ I TPDGFLSKSAPS ELI NM 
TGDLMPPNQVDSLSDDFTSLSKDGLIQKPGSNAFVGGAKNCSLS 
VDDQKDPVASTLGAMPNTLQITPAMAQGINADI KHQLMKEVRKF 
GRSK 


6817 


172 


3457 


LGMMDS P KIGNGLPVIGPGTD IGI S SLHMVGYLGKNFDS AKVPS 
DE YCPACKEKGKLKALKTYRI SFQES I FLCEDLQCI YPLGSKSL 
NNL I S P DLEE CHTPHKPQ KRKSLES S Y KDSLLLANS KKTRNY I A 
IDGGKVLNSKHNGEVYDETS S NLPDSSGQQNP IRTADSLERNE I 
LEADTVDMATTKDPATVDVS GTGRPS PQNEGCTS KLEMPLES KC 
TS FPQALCVQWKNAYALCWLDCILSALVHSEELKNTVTGLCS KE 
ES IFWRLLTKYNQANTLLYTSQLSGVKDGDCKKLTSEI FAEIET 
CLNEVRDE I F 1 S LQ PQLRCTLGDME S P VFAFPLLLKLETH I E KL 
FLYSFSWDFECSQCGHQYQNRHMKSLVTFTNVIPEWHPLNAAHF 
GPCNN CN S KSQ I RKMVLE KVS P I FMLH F VEGL PQNDLQHYAFH F 
EGCLYQITSVIQYRANNHFITWILDADGSWLECDDLKGPCSERH 
KKFEVPASEIHIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 
KPVSLTSCSVGDAASAETASVTHPKDISVAPRTLSQDTAVTHGD 
HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 
AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
VVNTNMQSVQLNTEDTVNTKSVNNTDATGLIQGVKSVEIEKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
S LKENQKKP FVGS WVKGL I S RGAS FMPLCVSAHNRNT I TDLQ PS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPISKPPAGPP 
S SNGTAAHPHAHAAS EVLE KSGSTS CGAQLNHS S YGNG I S SANH 
EDLVEGQI HKLRLKLRKKLKAEKKKLAALMSS PQSRTVRS ENLE 
QVPQDGS PNDCES IEDLLNELP YPIDIANESACTTVPGVSLYSS 
QTHEEILAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
EFNDVSQNTHLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTLNL 
ES PMKTD I FDEFFSS SALNALANDTLDLPHFDE YLFENY 


6818 


2 


240 


RGFDKVLWX/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
DGGE/LHS/ATTEHKP/VQATPVNLT\TILTSTWQARLPQI


6819 


1 


961 


G I PCTEMGNFDNANVTGE IE FAI HYCFKTHSLE I CI KACKNLAY 
GEEKKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQLQVS VWHLGTLARRVFLGE VI I PLATWDFEDS 
TTQSFRWHPLRAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ
EAQEGTDQPSLHGQLCLWLGAKNLPVRPDGTLNSFYKGCLTLP 
DQQKLRLKS PVLRKQACPQWKHS F VFSGVTPAQLRQ S SLELTVW 
DQALFGMNDRLLGGT\RLGSKGDTAVGGDACSQSKLQWQKVLSS
PNLWTDMTLVLH 1 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLKWRKHHRVIA 
GQFFGHHHTDSFRMLYDDAGVPISAMFITPGVTPWKTTLPGWN 
GANNPAIRVFEYDRATLSLKDMVTYFMNLSQANAQGTPRWELEY 
QLTEAYGVPDASAHSMHTVLDRIAGDQSTLQRYYVYNSVSYSAG 
VCTEACSMQHVCAMRQVDIDAYTTCLYASGTTPVPQLPLLLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGI VAQ I AGP LAAAD I SAYYI STFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6822 


1088 


518 


EFDI YR/EVGGEFVPVTRDDSSNGFPRTQHGPS PTVHP IQS PQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFS L I EG Y I \ S I VMDAE TQ KKF P S DLLLTSS S GELWRM VRI G 
GQ PLG F DE CG I VAQ I AGPLAAAD I S AY Y I S T FNFDHALVPEDG I 
GSVIEVLQRRQEGLAS 


6823 


654 


221 


P PKLLS RWARMGHGDE I V\LSDLNFPGLLHLPWGPWRS VQTAC 
GI PQLLEAVLKLLPLDTYVE SPAAVMELVPSDKERGLQTPVWTE 
YES ILRRAGCVRALAKIERFEFYERAKKAFAVVATGETALYGNL 
ILRKGVLALNPLL 


6824 


858 


104 


LLLAQR WGWG \ CCFFSLAVS VKMNVLLFAPGLLFLLLTQFGFRG 
ALPKLG I CAGLQWLGLPFLLENPSG YLSRS FDLGRQFLFHWTV 
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGESILS 
LLRDPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 
W YFHTLP YLLWAM PARWLTHLLRLL VLG L IE LS WNT YP STS CS S 
AALHICHAVILLQLWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILIILCSLMEPWALGACTFVHLL 
PKFD PLVI LKTLS S YP I KSMMGAP I VYRMLLQQDLS S YK FPHLQ 
NCLAGGESLLP ETLENWRAQTGLD IRE FYGQTETGLTCMVS KTM 
KI KPG YMGTAAS CYD VQI IDDKGNVL PPGTEGDIG I R VKP I R P I 
GI FSG YVDNPDKTAAN IRGDFWLLGDRG IKDEDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILALQFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 
PFGPLALPMDGYGDSLWEEHEYKFCLALVISTKLYHVRC 


6826 


2304 


954 


LKTESFKPW/VNIALAFHLLGERASPNSFWQPYIQTLPREYDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDS FTYEDYRWAVS S VMTRQNQI PTEDGSRVTLALI PLW 
DMCNHTNG LITTG YNLEDDRCECVALQD FRAGEQI Y I FYGTRSN 
AEFVIHSGFFFDNNSHDRVKIKLGVSKSDRLYAMKAEVLARAGI 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
RIFTLGKTSEFPVSWDNEVKLWTFLEDRASLLLKTYKTTIEEDKS 
VLKNHDLSVRAKMAIKLRLGEKEILEKAVKSAAVNREYYRQQME 
EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVQDALNI 
RE AI S KAKATENGLVNGENS I PNGTRS ENES LNQE S KRAVEDAK 
GSSSDSTAGVKE 


6827 


1 


779 


SSWEFGLSVLGGLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL 
E TRNLD PENG SGMALQ PLQAAPE PG AQGQRE KNSQH P PALAP PG 
HQGHSHGHQGGTD I TWMVLLGDGLHNLTDGLA IGAAFS DGFS S G 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALGL
GGAVXGVGLS LG P VPLT P WVFGVTAGVFLYVAL VDML PALFP S S 
GAP AYA\ HVLLQG LGLLLGG CLM LAI TLLEERLLP VTTE G 


6828 


3 


1654 


LGHLSQTASLKRGSSFQSGRDDTWRYKTPHRVAFVEKLTKLVLS 
QL PNFW KLW I S YVNG S L FSETAE KSGQ I ERS KNVRQRQNDFKKM 
I QEVMHS LVKLTRGALLPLS I RDGEAKQ YGG WEVKCELSGQWLA 
HAI QTVRLTHESLTALE I PNDLLQTI QDLILDLRVRCVMATLQH 
TAEE I KRLAE KEDW I VDNEGLTS L P CQFE QC I VCSLQS L KG VLE 

DTTHLSVDVSSPDLFGSIHEDFSLTSEQRLLIVLSNCCYLERHT 
FLNIAEHFEKHNFQGIEKITQVSMASLKELDQRLFENYIELKAD 
PIVGSLEPGIYAGYFDWKDCLPPTGVRNYLKEALVNIIAVHAEV 
FTI SKELVPRVLSKV I E AVSEELS RLMQCVS S FS KNGALQARLE 
ICALRDTVAVYLTPESKSSFKQALEALPQLSSGADKKLLEELLN 
KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782 


MRMEAGEAAP PAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR 

KLVLDKDMAEEGVLBEAE FYNI GPL I RI I KDRMEEKDYTVTQVP 
PKHVYRVLQCQEEELTQMVSTMSDGWRFEQLVNIGSSYNYGSED 
QAEFLCWSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 
EEVEVEQVQVEADAQE K/ CCYKPEAPGCEAPDHLQGLGVP I 


6830 


1 


939 


MEPGSVENLSIVYRSRDFLWNKHWDVRIDSKAWRETLTLQKQL 
R YR F PELAD PDTC YG FRF CHQLD FS TSGALC VALNKAAAGS A YR 
CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 
EGS QG CENP KPS LTDL WLEHGL YAGDP VS KVLLKPLTGRTHQL 
pv\ WPQAT.fJ'H'PwnnT« r rv , fZPVQrtPP'nppPT5MR/rr ua c*vt d tottvt 

ECVEVCTFDPFLPSLDACWSPHTLLQSLDQLVQALRATPDPDPE 
DRGPRPGS PS ALLPGPGRPPPP PTKP PETEAQRGPCLQWLSE WT 
LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKF I SEVSREDYGKKE I SGDS EEMN INS WTSADGENL 
EIQS YS LIGEKLVMEEAKT I VPPHVTDSKRVQKPAI APPS KWNI 
SIFKEE PRSDQKQKSLLS FDWDKVPQQPKSAS SNFASKNI TKE 
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
LRSVSPTEKKDNLENR\SYTL\AEKKVLAEKQNSV\APLELRDS
NE I GKTO ITLG S RS TE "LKE fi KADAM POHPVnTJPD VHP P P V T T Vf: 
SEKEKDEKKKK 


6832 


1809 


412 


MGS GL I S GP P QDNS GEALKE PERAQEHS LPNFAGG QH FFE YLLV 
VSLKKKRSEDDYEPI ITYQFPFCRENLLRGQQEEEERLLKAI PLF 
CFPDGNEWASLTEYPRETFSFVXTNVDGSRKIGYCRRLLPAGPG 
PRLP KVYC 1 1 S C I GCFGLFS KI LDE VE KRHQ I SMAVI YP FMQGL 
REAAFPAPGKTVTLKSFIPDSGTEFISLTRPLDSHLEHVDFSSL 
LHCLSFEQILQIFASAVLERKIIFLAEGLSTLSQCIHAAAALLY 
PFS WAHT YI P WPE S LLATVCCPTP FMVGVQMRFQQEVMDS PME 
E VLLVNLCEGTFLMS VGDE KDI LP P KLQDDI LDS LGQGINELKT 
AEQ INEHVS G P FVQ F FVKI VGH YAS Y I KREANGQGH FQERS FCK 
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQKILE 
YEEQKKQ/TETKGKNCE I RAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGLPKGPRVKSTRPGSSDIN 
VAPGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
Q VNGNL VRE PDHME L E EDRAGQ LNMRG VFLHVLG DALG S V I WV 
NALVFYFS WKGCSEGDFCVNPCFPDPCKAFVE I INSTHAS VYEA 
G P CWVL Y LD PT LCWM VCI LL YTT YPLLKE SAL I LLQT VP KQ I D 
IRNLIKELRNVEGVEEVHELHVWQLAGSRIIATAHIKCEDPTSY 
M EVAKTI KDVFHNHG IHATT I Q P E FASVGS KS S WPCELACRTQ 
CALKQCCGTLPQAPSGKDAEKTPAVSISCLELSNNLEKKPRRTK 
AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRLLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGSSASAYGWH*RLTPWSPGGS*HM*SSKAPVTQAREVLVAGP 
CSKLVLSGARGIVGTTVQVLVEAQQPLLLLFTGVWGLNLRAGEE 
S RAL * LI EEVTQ VRDAHLGNAWG CAQCLS QGQ VGSALAKALLE 
AAAAVRDCKEVLTVSGDKQQAEVSVRL*VRDVCVEEAGCVEFGQ 
AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 
LQQWGDAL*ARE+APQ I IVLLLLEDVAQLRTGKKA*DLWDVE 
QLLRQL I 


6835 


1 


834 


G I P AADR \ E AS LEL I KLD I SRT F PNLC I FQQGG P YHDMLHS I LG 
AYTC YRPD VG YVQGMS F I AAVL I LNLDTADAF I AFSNLLNKPCQ 
MAF FRVDHGLMLTY FAAFE VF F E ENLPKLFAHF KKNNLTPD I YL 
IDWI FTLYSKSLPLDLACRIWDVFCRDGEEFLFRTALGI LKLFE 
D I LTKMDF I HMAQF LTRLP EDL PAEE LFAS I AT I QMQS RN KKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAL 


6836 


1 


850 


MSCGRPPPDVDGMITLKV\DNLTYRTSPDSLRRVFEKYGRVGDV 
Y I PRE ^HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 
RRHRSRSRGPS CSRSRSRSR YRG S R YS RSP YSRS P YSRSR YS RS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 
KSRSRSKRPPKS PEEEGQMSS 


6837 


1 


1369 


TDGAAVAGNPGS DYFPGGTAP /GGPRTRRP \SGTSSSGS KASGP 
PNPPAQGDGTSLSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS PGQQQAS GAAVGG S S AGET 
RGAPTPHEKALTSPSWGKGAELLLGDQPDLIGSLDGGAKSDSSS 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVTGSP 
KLPPRGVGAGEHGPKAP PPALGLG IMSNSTSTPDS YGGGGGPGH 
PGTPG LEQVRTP TS SS GAP P PDE I HPLEI LQAQ I QLQRQQ FS I S 
EDQPLGLKGGKKGECAVGASGAQNGDSELGSCCSEAVKSAMSTI 
DLDS LMAEH S AAWYMP ADKALVDS ADDDKTLAP WE KAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDISNRFGTFVAALT 


6838 


16 


499 


LTDT P P P KTHM I HHS I S D Y KATLRCWALGFY PME I TLTWQQDE E 
DQTRDMELVETRPAGDGTFQKWAAVWPSGEE / Q/RYMCHVQHE 
GLPE PLTLRWEQSSQPT I PI VGI VAGLVLLGAWTGAWSAVMC 
RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGLSWPQVKRLDALLSEPIPIHG 
RGNFPTL S VQ PRQ I RAGG PQHPGGAG \ IHVHRVRLHGSAASHVL 
HPESGLG YKDLDLVFRMDLRS EAS FQLTKAWLACLLDFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDSVRRQFEFS IDSFQI ILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVI ATRS PEE I RGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFF I DFPDLVEQRRTLERYLEAHFGGAD 
AARRYACLVTLHRWNESTVCLMNHERRQTLDL IAALALQALAE 
QGPAATAALAWRPPGTDGWPATVNYYVTPVQPLLAHAYPTWLP 
CN 


6840 


4254 


2061 


ELQGDFSVPDVPKSMAWCENS I CVGFKRDYYLIRVDGKGS I KEL 
FPTGKQLEPLVAPLADGKVAVGQDDLTWLNEEG I CTQKCALNW 






WO 01/53312 



PCT/US00/34263



TDIPVAMEHQPPYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFI 
TSGGSNI I YVASNHFVWRLI PVPMATQIQQLLQDKQFELALQLA 
EMKDDSD S E KQQQ I HH I KNL YAFNLFCQKRFD ESMQ V FAKLGTD 
PTHVMGLYPDLLPTDYRKQLQYPNPLPVLSGAELEKAHLALIDY 
LTQKRSQLVKKLNDSDHQSSTSPLMEGTPTIKSKXKLLQIIDTT 
LLKCYL^TNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELII 
LYEKKGLHEKAU2VLVIX3SKKANSPLKGHERTVQYLQHLGTENL 
HLIFSYSVWVLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 
F KG LAI PYLEHI IHVWEETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
PFDGLLEERALLIX3RMGKHEQALFIYVHILKDTRMAEEYCHKHY 
DRNKIX3NKDVYLSLLRMYLSPPS IHCLGP I KLELLEPKANLQAA 
LQVLELHHSKLDTTKALNLLPANTQINDIRIFLEKVLEENAQKK 
RFNQVLKNLLHAEFLRV\QEERILHQQVKCIITEEKVCMVCKKK
IGNSAFARYPNGVWHYFCS \KEVNPADT 


6841 


1 


3206 


TPSTTGTKSNTPTSSVPSAAVTPLNESLQPLGDYGVGSKNSKRA 
RE KRDS RNME VQ VTQEMRNVS IGMGSSDEWSDVQDI IDSTPELD 
MCPETRLDRTGS SPTQG I VNKAFG INTDS LYHELSTAGSE VIGD 
VDEGADLLGEFSGMGKEVGNLLLENSQLLETKNALNWKNDLIA 
KVDQLSGEQEVLRGELEAAKQAKVKLENR I KELEEELKRVKSEA 
I IARREPKEEAEDVSS YLCTESDKI PMAQRRRFTRVEMARVLME 
RNQYKERLM ELQEAVRWTEM I RAS REHPS VQEKKXS TIWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SL P AKY KQ LS PNGGQEDTRM KNVP VP VYCRPLVEKDPTMKLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTLTTSKWIIDANQPGTWD 
QFTVCNAHVLCI SS IPAA5DSDYPPGEMFLDSDVNPEDPGADGV 
LAGITLVGCATRCNVPRSNCSSRGDTPVLDKGQGEVATIANGKV 
NPS QS TE EATEATE VPDPG P S E PETATLRPG PLTEHVFTD PAPT 
PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 
AQNGWL YVHSAVANWKKCLHS I KLKDS VLSLVHVKGRVLVALAD 
GTLAIFHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 
YKNKVHVI Q P KTMQ I E KS FDAHPRRE S QVRQ LAW I GDGVW VS I R 
LDSTLRL YHAHTHQHLQDVD I EPYVS KMLGTGKLGFS FVRITAL 
L VAGS RLW VGTGNG WIS I PLTET WLHRGQ \ LLG \ LRANKTS P 
TSGEG\ARPGG\ I IHVYG \DDS SDRAARS FI P YCSMAQAQLCFH 
GHRDAVK FFVS VPG WLATLNG S VLDS P AEG PG PAAPAS E VEGQ 
KLRNVLVLSGGEGY I DFR I GDGEDDETE EGAGDMS QVKPVLS KA 
ERSHI I VWQVS YTPE 


6842 


3 


926 


RCQQLSATILTDHQYLERTPLCAILKQKAPQQYRIRAKLRSYKP 
RRLFQS VKLHCP KCHLLQE V PHEGDLD 1 1 FQDGAT KT PD VKLQN 
TSLYDSKIWTTKNQKGRKVAVHFVKNNGILPLSNECLLLIEGGT 
L SE ICKLSNKFNS V I P VRSGHED LELLDLS AP FL I QGTVHHYG C 
KQWST*RSIQNLNSLVDKTSWIPSSVAEALGIVPLQYVFVMTFT 
LDDGTGVLEAYLMDS DKFFQ I PASEVLMDDDLQKS VDMIMDMFC 
PPGIKIDAYPWLECFIKSYNVTNGTDNQICYQIFDTTVAEDVI 


6843 


2 


851 


NHRKVLSGAKRYECNECGKS FAYTSS LI KHRRIHTGERPYECSE 
CGRSFAE NS SL I KHLRVHTGER P YE C VE CG KS FRRS S S LLQHQ R 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
S RKSSL I IHLR VHTGERP YECSD CGKS FAENSS L I KHLR VHTG E 
RP YEC I D CGKS FRH S S S FRRHQRVHTGMR P YK* S KFW KFSCPGF 
LLLQGQRVHTGSRCYECDKWGI FFS*NASFFT* KSAPTEEVPFE 
CNECEKAFS PLSLVTTI FT 


6844 


244 


642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKSE I YS FG I VLWEIATGDI P FQGCNSE KIRKLVAVKRQQE 
PLGSDCPSELREIIDECRAHDPSVRPSVDEILKKLSTFSK*CIK
I 


6845 


3 


1519 


vawdecywrhvfwdqdlwmllfilmchp^tararleyriITtld 

GALENAQNLG YQGAKFAWES ADSGLE VCP ED I YGVQE VHVNGAV 
GIAFELYYHTTQDLQLFREAGGWDVVRAVAEFWCSRVEWSPREE 
KYHLRGVMSPDEYHSGVNNSVYTNVLVQNSLRFAAALAQDLGLP
I PSQWLAVADKIKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 
YPVPFSLSPDVRRKNLEIYEAVTSPQGPAMTWSMFAVGWMELKD 
AVRARGLLDRSFANMAEPFKVWTENADGSGAVNFLTGMGGFLQA 
WFG CTG FR VTRAGVTFD P VCLSG I S RVS VS G I F YQGNKLNFS F 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSSEFPGRTFSDVRDPLQSPLWVTLGSSSP 
TESLTVDPASE*SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


L Y FL KT I K * LNRLAEHP * YENEKLT KLRNT I ME Q YTRT EE S ARG 
1 1 FTKTRQSAYALSQWITENEKFAEVGVKAHHLIGAGHSSEFKP 
MTQNEQKE VI SKFRTGKINLL 1ATTVAEEGLDI KECNI V I R YGL 
VTNE I AM VQARGRARAD E S T YVL VAHS G SG V I EHET VND FRE KM 
MYKA I HCVQNMKPEE YAHKI LELQMQS IMEKKMKTKRNIAKHYK 
KNPSLITFLCKNCSVLACSGEDIHVIEKMHHVNMTPEFKELYIV 
RENKTLQKKCADYQ INGE 1 1 CKCGQAWGTMMVHKGLDLPCLKIR 
NFVWFKNNS TKKQ YKKWVE L P I TF P NLD Y S EC CLFS DED 


6847 


1450 


348 


SMCWNSDRLEMPLIDLALILYPPSYVPYTGHLSDDSLSRKVCLT 
WFEDALNGVL*RAEAIQPHCVNAGDRMEKFRQKYWNKLQTLRQQ 
PFAYGTLTVRS LLDTREHCLNEFNFPDP YS KVKQRENGVALRCF 
PGWRSLDALGWEERQLALVKGLLAGNVFDWGAKAVSAVLESDP 
YFGFEEAKRKLQERPWLVDSYSEWLQRLKGPPHKCALIFADNSG 
IDIILGVFPFVRELLLRGTEVILACNSGPALNDVTHSESLIVAE
R I AGMD P WHS ALREERLLLVQTGS S S PCLDL S RLDKGLAAL VR 

ERGADLWIEGMGRAVHTNYHAALRCESLKLAVIKNAWLAERLG

GRLFS VI FKYE VPAE 


6848


19 


16 


AMWWNSLDGIRNIVLSNPKKRNTLSLAMLKSLQSDILHDADSND
LKVI 1 1 SAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
IRNHPVPVI AMVNGLATAAGCQLVAS CDIAVASDKS S FATPGVN 
VGLFCSTPGVALARAVPRKVALEMLFTGEPISAQEALLHGLLNK 
WPEAELQEETMRrARKIASLSRPWSLGKATFYKQLPQDIiGrA 
YYLTSQAMVDNLAZiRDGQEGI TAFLQKRKPVWSHEPV*VEH 


6849 


70 


821 


SLGVDGSCLEQGSPAPRPQTDTSP*PVGNWATQQEDLYHQSYEC 
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELLSK 
PKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAERSCSHLGTM 
VEFAVALGSKLDVINKHSFNNFRLRVGLNHGPWAGVIGAQKPQ 
YD I WGNTVNVAS RMESTGVLG KIQVTE ETAWALQS LG YTC YS RG 
VIKVKGKGQLCTYFLNTDLTRTGPPSATLG 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQELHLFMLSGVPDAVFDLTD 
LDVLKLEL I PEAKI PAKI SQMTNLQELHLCHCPAKVEQTAFS FL 
RDHLRCLHVKFTDVAE IPAWVYLLKNLRELYL IGNLNSENNKM I 
GLESLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVLNS LKKMMNVAELELQNCELER I PHAI FS L SNLQE LDLKS 
NNIRTI EEI IS FQHLKRLTCLKLWHNKI VTI PPS I THVKNLES L 
Y FSNNKLE SL P VAVFS LQKLRCLDVS YNN I SM I P I E I GLLQNLQ 
HLHITGNKVD ILPKQLFKCI KLRTLNLGQNCITSLPEKVGQLSQ 
LTQLE L KGNCLDRLP AQLGQCRMLKKS GLWEDHL FDTL P LE VK 
E ALNQD IN I P FANG I 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSLEEAEDCYPPSLLTLD 
LRDLFNQVEQGPLLS CPKAGTDLSMGRAREVGWMAAGLMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








MARPWTEDGDWTEPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE 
HKNTWSAQNCKNGSCVLDLSKCLFIQGKLLFAEPKDAGFPFSQD 
INSHLASLSMARNTSPTPDPTVREALCAPDNLNASIESQGQIKM 
YirTEVCRETVSRCCNSFLQQAGLNLLISMTVINNMLAKSASDLK 
FPLISEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFSFMSL 
FIRNGNREILliETPAP 


6852 


1 


407 


RTRGEET YANF I KHNDGKNI FYAARTPATLFAVMFAMY 1 1 SGLT 
GFIGLNSIAVLCmjVMGLALIFIiCTWAYVKYSGEFREIGTVIDQ 
I AETLWEQVLKPLGDNLMEENIRQS VTNS I KAGLTDQVSHHARL 
KTD 


6853 


3 


469 


GDSCAVCIELYKPNDLVRILTCNHIFHKTCVDPWLLEHRTCPMC 
KCD I LKALGI E VD VEDG S VS LQVP VS NE I FNS AS SHE EDNR SET 
AS S G YAS VQGT YE P PLE EHVQS TNE SLQLVNHE ANS VAVD VT PH 
VDNPTFEEDETPNQETAVREIKS 


6854 


1148 


585 


HESYIGTFDPGELCVCAAIQWLQDNSASYFLNRKLVYEPSTQAK 
PVKNTFLRMWIYSHHIYQQDLRKKILDVGKRLDVTGFCMTGKPG 
IICVEGFKEHCEEFWHTIRYPNWKHISCKHAESVETEGNGEDLR 
LFHSFEEHiIiEAHGDYGLRNDYHMNLGQFLEFLKKHKSEHVFQI 
LFGIESKSSDS 


6855 


1913 


1148 


GRVGGRVGRI CS PLSGANE Y IASTDTLKTEEVLLFTDQTDDLAK 
EEPTSLFQRDSETKGESGLVLEGDKEIHQIFEDLDKKLALASRF 
YIPEGCIQRWAAEMVVALDAIjHREGIVCRDLNPNNILLNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITEETEACDWWSL 
GAVLFELLTGKTLVECHPAGINTHTTLNMPEWVSEEARSLIQQL 
LQFNP LERLG AGVAGVED I KSHPFFTPVDWAELMR 


6856 


1617 


997 


VTQ L YVS VDAS TKDS LKKI D RPL FKD FWQQFL DS LKALAVKQQR 
TVYRLTLVKAWNVDELQAYAQLVSLGNPDFIEVKGVTYCGESSA 
SSLTMAHVPWHEEWQFVRELVDLIPEYEIACEHEHSNCLLIAH 
RKFKIGGEWWTWINYNRFQELIQEYEDSGGSKTFSAKDYMARTP 
HWALFGASERGFDPKDTRHQRKNKSKAISGC 


6857 


1 


617 


KGP EATAM VCVCSHPNCRQNHI KPS HS AAQT W CGS P TPAS APNH 
KLMAME QGKTL P S ATEDAKEEGLEAQ I SRLAEL I GRLE S KALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYLRGT 
TGVRNCFHITAVRLSDGFTFVIYEFWETEEAWKRHLQSPLCXAF 
RHVKVDTIjSQPEALSRILVPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA 
LRCVQTAKLILEELKLEKKIKIRVEPGIFEWTKWEAGKTTPTLM 
SLEELKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
IVNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECGDFAQLVR 
KI PS LGMCFCEENKEEGKWELVNP PVKTLTHGANAAFNWRNW I S 
GN 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDIIQSPSSTGLLKSG 
KTNS VES LPE LLTS DSEGS YAGVG S P RDLQS PD FTTGFHS D K I E 
AKVKP YVNGTS PVYSRE DLKPWEKS P I LKISAPQPI PSNRIDTT 
SSASWVAGSFSPVSPPWDLRTIMEIEESRQKCGATPKSHLGKT 
VSHGVKLSQKQRKMIALTTKENNSGMNSMETVLFTPSKAPKPVN 
AWASSLHSVSSKSFRDFLLEEKKSVTSHSSGDHVKKVSFKGIEN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAPVTFASIVEEELQQEAALIRSREKPLALIQIEEHAIQDLLVF 
YEAFGNPEEFVIVERTPQGPLAVPMWNKHGC 


6860 


1889 


1515 


DKD KKRQ KKRG I F P KVATN I MRAWLFQHLTHP YPS E EQKKQLAQ 
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP 
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


DKD KKRQ KKRG I F P KVATN I MRAWL FQHLTHP Y PS EEQKKQ LAQ 
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSOGAAYSPEGQP 
MGS FVLDGQQHMG IRPAGPMS GMGMNMGMDGQWHYM 


6862 


2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNSEGNA 
DEEDPLGPNCYYDKTKSFFDNISCDDNRERRPTWAEERRLNAET 
FG I P LRPNRGRGG YRGRGGLG FRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGREFADFEYRKTTAFGP 


6863 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCKQVCSTVGGS 
AICSCFPGYAIMADGVSCEDQDECLMGAHDCSRRQFCVNTLGSF 
YCVNHTVLCADGYILNAHRKCVDINECVTDLHTCSRGEHCVNTL 
GS FHCYKALTCE PG YALKDGECEDVDECAMGTHTCQPG FLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSLSEPCRPGFSCI 
NTVGSYTCQRNPLICARGYHASDDGTKCVDVNECETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGS YRCS CASGFLLAADG KRCE DVNE CE AQRCS QE CAN I Y 
GSYQCYCRQGYQLAEDGHTCTDIDECAQGAGILCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDECALGTHNCSEAETCHNIQGS 
FR CLRFE CP PNYVQ VS KTKCERTTCHDFLECQNS PAR ITHYQLN 
FQTGLLVPAHI FRIGPAPAFTGDTIALNI IKGNEEGYFGTRRLN 
AYTGWYLQRAVLEPRDFALDVEMKLWRQGS VTTFLAKMHI FFT 
TFAL 


6864 


2 


2933 


LADSSPSNLQIIIKELLSMHHQPDPALTKEFDYLPPVDSRSSSG 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQSLFGHLMESKLQYYVPENFWKIFKMWNKELYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 
ERE EAFMALNLG VTS CQS LE I S LDQ FVRGEVLEGSNAY YC E KCK 
EKRI TVKRTCI KSLPS VLVIHLMRFGFD WESGRS I KYDEQ I RFP 
WMLNMEPYTVSGMARQDSSSEVGENGRSVDQGGGGSPRKKVALT 
ENYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 
EFDLNDE TLE YE CFGGE YR P KVYDQTNP YTDVRRR YWNAYML FY 
QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 
PHRPNNDRLS I LTKLVKKGE KKGLFVEKMPAR I YQMVRDENLKF 
MKNRDVYSSDYFSFVLSLASLNATKLKHPYYPCMAKVSLQIiAIQ 
FLFQT YLRTKKKLRVDTEE W I ATI EALLS KS FDACQ WLVE YF I S 
SEGREL I KI FLLECNVREVRVAVAT I LEKTLDS ALF YQDKLKSL 
HQLLEVLLALLDKDVPENCKNCAQYFFLFNTFVQKQGIRAGDLL 
LRHSALRHMISFLLGASRQNNQ IRRWS S AQAREFGNLHNTVALL 
VLHS DVS SQRNVAPG I FKQR P P I S I AP S S PLLPLHEE VEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMWYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKNTFQLLHEILVIEDPIQVERVKFVFETEN 
GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCP/\AKEYFKENSHH 
WSWAVQWLQKKMSEHYWTLQSNVSNETSTGKTFQRTISAQDTLA 
YATALLNEKEQSGSSNGSESSPANENGDRHLQQGSESPMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQKAAD YNQALGTCRLAGTALCVAAGVLLAI CLFWAM IGWLSQ 
DTKAE PLDP E ADSHVE VFGDE PEQQLS P I FRNASGQS WFS PPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTLYGLRATCMRDLDWAW INAVSAFKALEQDLP VNI KF 
1 1 EGME EAGS VALEELVE KE KDRFFSG VD Y I V I S DNL W I S QRKP 
AITYGTRGNSYFMVEVKCRDQDFHSGTFGGILHEPMADLVALLG 
S LVDSSGHI L VPG I YDEWPLTEEE INT YKAI HLDLEE YRNS SR 
VE KFLFDTKEE I LMHLWR YPS LSI HGI EGAFDE PGTKT V I PGRV 
IGKFSIRLVPHMNVSAVEKQVTRHLEDVFSKRNSSNKMVVSMTL 
G LHP WI ANI DDTQYIiAAKRAI RTV FGTE PDM IRDGSTIPI AKM F 
QEIVHKSWLIPLGAVDDGEHSQNEKINRWNYIEGTKLFAAFFL 
EMAQLH 


6867 


2833 


1704 


GTRI MSQPKQKELAGFVRQKMLLD YSVYMGRC VPQES RSPQRS P 
LQSAESSPTAGKKLPEVPPSEEEEQEAWVNALLGRIFWDFLGEK 
YWSDLVSKKI QMKLSKIKLP YFMNELTLTELDMGVAVPKI LQAF 
KP YVDHQGLW I DL EMS YNGS FLMTLETKMNLT KLG KE P L VEALK 
VGEIGKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 
LPG AEG YVGGHRT S KI MRF VDKI TKS K YFQKATETE F I KKKI EE 
VSNTPLIiLTVEVQECRGTLAVNIPPPPTDRVWYGFRKPPHVELK 
ARPKLGER EVTL VHVTDW I E KKL EQE FQKVFVM PNMDDVY I T IM 
HSAMDPRSTSCLLKDPPVEAADQP 


6868 


1 


346 


R PTR PPTR PEE I KNLILP Y I SDMNFVQDLCEDFYELFKTDKGFD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGR I VHLS NS FTQTVNCRKP F FS S W 


6869 


3 


1619 


M YME RMD KRAL I S F WE S VEHLKNANKNE I PQLVGE I YQN F FVES 
KEI S VEKSLYKE IQQCLVGNKGIEVFYKIQEDVYETLKDRYYPS 
F I VS DL YE KLL I KEEE KHASQMI SNKDSMGPRDE AG EE AVDDGT 
NQ I NEQAS FAVNKLRE LNEKLE YKRQALNS IQNAP KPD KK I VS K 
LKDEI ILIEKERTDLQLHMARTDWWCENLGMWKASITSGEVTEE 
NGEQLPCYFVMVSLQEVGGVETKNWTVPKRLSEFHNIiHRKLSEC 
VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 
LCQSEALYAFLS PS PDYLKVI DVQGKKNS FSLS S FLERLPRDFF 
SHQEEETEEDSDLSDYGDDVDGRKDALAEPCFMLIGEIFELRGM 
FKWVRRTLI ALVQVTFGRT INKQI RDTVS WI FS EQMLVYY I N I F 
RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSL 
VGQQNARHGI I KI FNALQETRANKHLLYALMELLLI ELCPELRV 
HLDQLKAGQV 


6870 


1 


1566 


MAAVVAATRWWQLLLVLSAAGMGASGAPQPPNILLLIjMDDMGWG 
DLGVYGEPSRETPNLDRMAAEGLLFPNFYSANPLCSPSRAALLT 
GRLPI RNGF YTTNAHARNA YTPQE I VGG I PDS EQLLPELLKKAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NIPVYRDWEMVGRYYEEFP INLKTGE ANLTQI YLQEAIiDF I KRQ 
ARHHP FFLYWAVDATHAP VYAS KPFLGTS QRGR YGD AVRE I DDS 
I GKI LELLQDLHVADNTFVF FTSDNGAAL IS APEQGGSNGPFLC 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGSIMDLFTTSLAL 
AGLTPPSDRAIDGLNLiLPTLLQGRLMDRPIFYYRGDTLMAATLG 
QHKAH FWT WTNS WENFRQG I D FCPGQNVS GVTTHNLEDHTKLPL 
IFHLGRDPGERFPLSFASAEYQEALSRITSWQQHQEALVPAQP 
QLNVCNWAVMNWAPPGCEKLGKCLTPPESIPKKCLWSH 


6871 


209 


1126 


RMSLNPP I FLKRSEENSSKFVETKQSQTTS I ASEDPLQNLCLAS 
QEVLQKAQQSGRSKCLKCGGSRMFYCYTCYVPVENVPIEQIPLV 
KLPLKID II KHPNETDGKSTAIHAKLLAPEFVNI YTYPCI PE YE 
EKDHEVALIFPGPQSISIKDISFHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFliSTIEAIYYFIiVDYHT 
DILKEKYRGQYDNLLFFYSFMYQLIKNAKCSGDKETGKLTH 


6872 


880 


459 


FGLLMWLSLI FMKGNCVREDLI FNFLFKLGLDVRETNGLFGNT 
KKLITEy FVRQKYLE YRRI P YTEPAE YE FLWGPRAFLETSKMLV 
LRFLAKLHKKDPQSWPFHYLEALAECEWEDTDEDBPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKKEDSHCN 
I IHTE I FGFSNKYWELRRRRP KLKKLKKLLMENPYEGPDSQKEK 
D SNSS KYTTEDLLDQ I QASE EE I MTQLQ VLNACKI GG YWR I LE F 
D YEMKLLNH VTQLVDS ES WS FGKVP LNTCLQELGP LE PEEM I EH 
CLKCYGKKYVDEGE VYFELDADKI CRAAARMLLQNAVKFNLAEF 
QEVWQQSVPEGMVTSLDQLKGLALVDRHSRPEIIFLLKVDDLPE 
DNQ BR FNSL FSLRE KWTEED IAP Y I QD LCG E KQT I GALLT K YS H 
S SMQNGVKVYNSRRP I S 


6874 


1 


307 


DSIADHVNSAAVNVEEGTKNLGKAAKYKLAALPVAGALIGGMVG 
GPIGLIAGFKVAGIAAALGGGVLGFTGGKLIQRKKQKMMEKLTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGERGNSASEKWEIMFNEELGDPFIIIHSISLLNAEEHSIA 
TLLLR I E KEE LDMKGSG F YVSLE WVT I S KKNQDN KK YE 1 1 KRD I 
LRGKSVPHYAAIEPDGNGLMIVSYKSLTFVQAGQDLEENMDEDI 
S E K I KE PL Y YWQQTE DDLTVT I RLP EDNTKED I Q I QFL PDHINI 
VLKDHQFLEGKLYSSIDHESSTWIIKESNSLEISLIKKNEGLTW 
PE LV I GDKQGEL I RD S AQCAAI AE RLMHLTS EE LNPNPD KE KPP 
CNAQELEECDIFFEESSSLCRFDGNTLKTTHWNLGSNQYLFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QAS KRDKKFFACAPNYS YAALCBCLRRVF I YRQPAPMSTVXiYNR 
KEGRQVGQVAKQQVASLETNDP ILGFQATNERLFVLTTKNLFLI 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS 
LHTKPRMPPCDFMPERYQVIFLWSGSEANELAMLMARAHSNNI 
DIISFRGAYHGCSPYTLGLTNVGIYKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 
VAKS IAGFFAE P I QG VNG WQYP KG FLKE AFE LVRARGG VC I AN 
EVQTGFGRLGSHFWGFQTHDVLPDIVTMAKGIGNGFFMAAVITT 
PEIAKSLAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYMLLKFAKLRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 
PREEVNQIHEDCKHMGLLVGRGSIFSQTFRIAPSMCITKPEVDF 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


GTSPSPARAYAPPTERKRFYQNVSITQGEGGFEINLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDTIKYYTMHLTTLCNTSLDN 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVELQRNEWDPII 
EWAE KRYGVE I S S S TS I MG P S I P AKTRE VLVS HliAS YNT WALQG 
I E F VAAQLKS MVLTLGL I D LRLT VEQ AVLLS RLE EE YQ I Q KWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


QTLQGDFKNRAEMIDFWlRIKNVTRSnAGKYRCEVSAPSEQGQN 
LEEDTVTLEVLVAPAVPSCEVPSSALSGTWELRCQDKEGNPAP 
EYTWFKDGIRLLENPRLGSQSTNSSYTMNTKTGTLQFNTVSKLD 
TGEYSCEARNSVGYRRCPGKRMQVDDLNISGI IAAWWALVIS 
VCGLGVC YAQRKGYFSKETS FQKSNS SSKATTMS ENDFKHTKSF 
II 


6879 


3 


845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPJDGTTVTVRVK 
KNSTTDQVYQAIAAKVGMDSTTVNYFALFEVI S US FVRKLAPNE 
FPH KL Y I QNYTS AVPGTCLT I RKWLFTTEEE I LLNDNDLAVTY F 
FHQAVDDVKKGYI KAEEKS YQLQKLYEQRKMVMYLNMLRTCEGY 
NE 1 1 F PHCACDS RR KGHV I TAI S I THFKLHACTE EGQL ENQVI A 
FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKIFTPYFNYMHE 
CFERVFCELKWRKEEY 


6880 


2110 


1437 


RKDNCTAKE W TF P EAKWNTTARVFSH I RLGMGHVL 1 1 VQ C F I S S 
MAN I YNE K I LKEGNQLTE S I F IQNS KL YFFG I L FNGLTLGLQRS 
NRDQI KNCGFFYGHRAFSVALI FVTAFQGLSVAF ILKFLDNMFH 
VLMAQ VTTVI I TTVS VL VFD FRPS LE FFLEAP S VLLS I F I YNAS 
KPQVPEYAPRQERIRDLSGNLWERSSGDGEELERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDSKWEDIHVITGALKMFFRELPEPLFTFNHFNDFVNAIKQEPR 
QRVAAVKDLIRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 
I AI VFG PTLL KPE KE TGN I AVHTVYQNQ I VEL I LLE LS S I FGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNMVTARQEPRLVLISLTCDGDTLTLSAAYTKDLLLPIKTPT 
TNAVHKCRVHGLE I EGRDCGEATAQW I TS FLKSQP YRLVHFEPH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 
KVKATN FRPN I VI SGCD V YAEDS WDELL I GD VELKR VMACSRC I 
LTTVDPDTGVMSRKEPLETLKSYRQCDPSERKLYGKSPLFGQYF 
VLENPGTI KVGDPVYLLGQ 


6883 


2794 


2256 


NSKLKLNQNLKLFITLTYQVLSLHGWGPGIHLQKEGAFPVTQNR 
ALQLLYDLRyLNIVLTAKGDBVKSGRSKPDSRIEKVTDHLEALI 
DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 
NSQE PHNI LP LASSQ I RFGLLPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


EFERVTAEAVKPRETSEPRAAAQRFCEKFPFL 


6885 


297 


1554 


STGQFWHVTDLHLDPTYH ITDDHTKVCASS KGANASNPG PFGDV 
LCDSPYQLILSAFDFIKNSGQEASFMIWTGDSPPHVPVPELSTD 
TVINVITNMTTTIQSLFPNLQVFPALGNHDYWPQDQLSVVTSKV 
YNAVANLWKPWLDEEAISTLRKGGFYSQKVTTNPJTLRI I SLNTN 
L YYG PN I MTLNKTDP ANQFE WLE S TLNNSQQNKE KVY 1 1 AHVPV 
GYLPSSQNITAMREYYNEKLIDIFQKYSDVIAGQFYGHTHRDSI 
MVLSDKKGSPVNSLFVAPAVTPVKSVLEKQTNNPGIRLFQYDPR 
D YKLLDMLQ Y YLNLTEANL KGES I WKLE Y ILTQTYD I E DLQP ES 
LYGLAXQFTILDSKQF I KYYNYFFVS YDSSVTCDKTCKAFQICA 
IMNLDNISYADCIiKQLYIKHNY 


6886 


2 


1341 


QCGG I PGREGGS SRPLEEGTGSSPAC VRGAAPGSEDAF Y PTRAK 
QARVSQELKKAAKRTVSISEGPDTLGDGMRERRETLALAPEPEP 
LEKEACEKWKR P FRS AS AT S LTLS HCVDWKGLLD FKKRRGHS I 
GGAPEQRYQI I PVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS 
LLATGGADRLIHLWNWGSRLEANQTLEGAGGS ITS VDFDPSGY 
QVLAAT YNQAAQLW KVGEAQS KETLS GHKDKVTAAKFKLTRHQA 
VTGSRDRTVKEWDLGRAYCSRTINVLS YCNDWCGDHI I ISGHN 
DQKIRFWD S RG PHCTQ V I P VQGRVTS LS LS HDQLHLLS CS RDNT 
LKVIDLRVSNIRQVFRADGFKCGSDWTKAVFSPDRSYALAGSCD 
GALYIWDVDTGKLESRLQGPHCAAVNAVAWCYSGSHMVSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGGCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQ S PRS PAGPFRGGTG WWPE PAVCLCVAVGPQRLSSPGLVY 
NASGS EHC YD I YRL YH S CADP TGCGTGP DARAWD YQACTE I NLT 
FASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTSFWGG 
DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRASHPED PASWEARKLEAT I IGEWVKAARREQQPALRGGPRL 
SL 


6888 


1 


992 


FVAYVKKEIPHIWTHCLLNPHALVIKTLPTKLRDALFTWRVI 
NFIKGRAPNHRLFQAFFEEIGIEYSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EE 1 1 VSDNEG I F I AAE I TLHLQQ LS NF FHG YFS IGDLNEAS KW I 
LDPFLFNIDFVDDSYLMKNDLAELRASGQILMEFETMKLEDFWC 
AQFTAFPNLAKTALE I LMPFATTYLCELGFS I TFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLE NQ I KE EREQDNS ES PNGRTS P L VS QNNEQGS TLRDLLTTT 
AGKLRVGSTDAGIAFAPVYSMGAPSSKSGRTMPNILDDIIASW 
ENKIPPSKTSKINVKPELKEEPEESIISAVDENNKLYSDIPHSW 
ICEKHILWLKDYKNSSNWKLFKECWKQGQPAWSGVHKKMKISL 
WKAES ISLDFGDHQADLLNCKDS I ISNANVKEFWDGFEEVSKRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLASHLPGFFVRPDLGPRLCSAYGWAAKDHDIGTTNLHI 
EVSDWNILVYVGIAKGNGILSKAGILKKFEEEDLDDILRKRLK 
DS S E I PGALWHI YAGKDVDKI RE FLQKI S KEQGLE VLPE HD PIR 
DQS WYVNKKLRQRLLEE YGVRTWTL I QFLGDAI VLPAGALHQVQ 
NFHSCIQVTEDFVSPEHLVESFHLTQELRIiLKEEINYDDKLQVK 
NILYHAVKEMVRALKIHEDEVDDMEEN 


6890 


3 


667 


1 nAUjMW 1 J?JjI LiHRAL V VHJVrAETCINIS PPCGAKDSLI FGAITCF 
TGFLGVDTGAGATRWCRIiKTQRADPLVCAVGMLGSAI FI CLI FV 
AAKS S I VGAYI C I FVGETLLFS NWAI TAD I LMYWI P TRRATAV 
ALQSFTSHLLGDAGSPYLIGFISDLIRQSTKDSPLWEFLSLGYA 

LMIiCPFVWLGGMFFLATALFFVSDRARAEQQVNQLAMPPASVK 
V 


6891 


1980 


1262 


LRIHQELLSKELKLLRGITIESI IHIGLAAGKEQFMQDASNVMQ 
LLLKTQSHLYNMEDNNPEVRQAAAYGLGVMAQFGGDDYRSLCSE 
AVPLLVKVIKRAHSKTKKNVTATENCISAIGKILKFKPNCVNVD 
E VLPHWLS WL PLHED KE EAI QTLS FLCDL I E SNHP W I G PNNSN 
LPKI ISIIAEGKINETINYEDPCT^KRLANWRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892 


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV 
FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVED1YCDNPPH 
QPVAI ELWKAVKRHNLTKRWLMKI VDEREKNLDDKAYRN I KELE 
N YAENTQSSLL YLTLE I LG I KDLHADHAAS H I GKAQG I VTC LRA 
TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDKNVRDVIYDIA 
SQAHLHLKHARS FHKTVP VKAFPAFLQTVSLEDFLKKI QRVDFD 
IFHPSLQQKNTLLPLYLYIQSWRKTY 


6893 


1 


842 


DGERKSMSVERTFSEINKAEEQYSLCQELCSELAQDLQKERLKG 
RTVTI KLKNVNFEVKTRASTVSSVVSTAEEI FAI AKELLKTEID 
ADFPHPLRLRLMGVRISSFPNEEDRKHQQRS I IGFLQAGNQALS 
ATECTLEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
EAVN KQS FQT S Q PFQ VLKKKMNENL E I S ENS DDCQI LT C P VCFR 
AQGCISLEALNKHVDECLDGPSISENFKMFSCSHVSATKVNKKE 
NVPASS LCEKQD YEAH 


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 
DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 


2379 


478 


VT Y VE L CDLAS PTALL IMRT VLDL I VEDLQS TS EDKEQQ YTS QT 
TRLIALLYAIASHKACKIAILHLINGTI KGDERYAE IFQDLLAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 
EQLSNSLPNKELMTS I CD CLLATLANSESSYNCLLTCVRTMMFL 
AEHD YGL FHLKS S LR KNS S ALHS LL KR WSTFS KDTGELAS SFL 
EFMRQILNSDTIGCCGDDNGLMEVEGAHTSRTMSINAAELKQLL 
Q S KE E S PENLFLEL E KLVLEHS KDDDNLDSLLDS WGLKQMLE S 
SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQLKS 
WNriV r tiAU, U, IDTDIjDLVKVD L I E LS E KCCS DFDLHS E LERS FL 
S E PSS PGRTKTTKGFKLGKHKHETF I TS SGKS E Y I E PAKRAHW 
PPPRGRGRGGFGQGIRPHDIFRQRKQNTSRPPSMHVDDFVAAES 
KEWPQDGIPPPKRPLKVSQKISSRGGFSGNRGGRGAFHSQNRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 
KFVSGGSGRGRHVRS FTR 


6896 


1 


555 


GN I VIQKKKYNKQH 1 1 PLENVT I DS I KDEGDLRNGWL I KTPTKS 
FAVYAATATEKSEPmNHINKCVTDLLSKSGKTPSNEHAAVWVPD 
SEATVCMRCQKAKFTPVNRRHHCRKCX3FWCX3PCSEKRFLLPSQ 
SSKPVRICDFCYDLLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 


GDGLMHEVWGLMERPDWETAlQKPLCSLPAGSGNALAASLNHY 
AGYEQVTNEDLLTNCTLLLCKRLLSPMNLLSLHTASGLRLFSVL 
S LAWG F I AD VDLESE KYRRLGEMR FTLGTFLRLAALRT YRGRLA 
YLPVGRVGS KTPASP WVQQGPVDAHLVPLEE P VPSHV7TWPDE 
DFVLVLALLHSHI^SEMFAAPMGRCAAGVMHLFYVRAGVSRAML 






WO 01/53312 



PCT/US00/34263 



LRLFLAMEKGRHMEYECPYLVYVPVVAFRLEPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVASLLKGRQGI YTENERRMGAVI KIRFFKIMLVLI ICW 
LSN I INES LLFYLEMQTD I NGGSLKP VRTAAKTTWF I MGI LNPA 
QGFLLSLAFYGWTGCSLGFQSPRKEIQWESLTTSAAEGAHPSPL 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIEIHTASESC 
NKNEGD PALPTHGDL 


6899 


120 


827 


MKVRKNNDAYLLDKNKINMDCFISCFFKKMLTTLMFSKSGILSL 
LEHGEEYTFSLPCAYARS I LTVP WVE LGGKVS VNCAKTG YS AS I 
TFHTKPFYGGKLHRVTAEVKHNITNTWCRVQGEWNSVLEFTYS 
NGETKYVDLTKLAVTKKRVRPLEKQDPFESRRLWKNVTDSLRES 
E I DKATEHKHTLE ERQRTEERHRTETGTPWKT KY F I KEGDGWVY 
HKPLWKIIPTTQPAE 


6900 


3 


451 


TEVLGSKGIHELRSSTSALHHALEESASLLTMFWRAALPSTHIP 
VLPGKVGESTERELLELRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQ F I VS QLTRTHD VLKKARTNLE VR KLLHQ S EAP S LS PTHHH P 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDNMVQRLETDFKMTLQQQSTLEQWAAWLDNVMMQALKPYEGRP 
SFPKAARQFLLKWSFYRYHLGFS 


6902 


2 


267 


GAP P P P P SQ P P RQP PQAAPS SHPHSDLT FNP S S ALEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTGIHILVIDQMVQNFQDESCFLFSTVKAESSDGI 
HULK 


6904 


464 


2092 


MEASLPVSLSCVLACGDVEGKFDILFNRVQAIQKKSGNFDLLLC 
VGN F FGS TQDAE W EE YKTG I KKAP IQT YVLGANNQETVKYFQDA 
DGCELAENITYLGRKGIFTGSSGLQIVYLSGTESLNEPVPGYSF 
SPKDVSSLRMMLCTTSQFKGVDILLTSPWPKCVGNFGNSSGEVD 
TKKCG S ALVSS LATG LKPR YHFAALE KT YYERL P YRNH 1 1 LQEN 
AQHATR F I ALANVGN PEKKKYL YAFS I V PMKLMDAAELVKQ P PD 
VTENP YRKS GQE AS I GKQ I LAP VEES ACQ FFFDLNE KQGR KRSS 
TGRDS KS S PHPKQPRKPP QP PG PCWFCLAS PE VE KHL WNI GTH 
C YLALAKGGLS DDHVLILP IGHYQS WELSAEWEEVEKYKATL 
RRFFKSRGKWCWFERNYKSHHLQLQVI PVPISCSTTDDI KDAF 
I TQAQEQQ I ELLE I PEHSD I KQ IAQPGAAYF YVELDTGEKLFHR 
IKKNFPLQFGREVLASEAILNVPDKSDWRQCQISKEDEETLARR 
FRBCD FE P YD FT LDD 


6905 


1 


226 


VS KTGE AET I T SH YL FALG VYRTLYL FNW I WR YHFEG F FDL I AI 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


611 


SYDDHNGHIDFITAASNLRAKMYSIEPADRFKTKRIAGKIIPAI 
ATTTATVSGLVALEM I KVTGG YPFEAYKNWFLNLAI PIWFTET 
TEVRKTKI RNGI S FT I WDRWTVHGKEDFTLLDFINAVKEKYG I E 
PTMWQGVKMLYVPVMPGWAKRLKLTMHKLVKPTTEKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYFSHDTD 


6907 


2 


2228 


LRG VP VWAAGAFR FS S G EE S TS HL I MS RRS QRLTR YS QGDDDGS 
SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNMKRLSPAPQLGPSS 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSDVDQQSS 
S SRLRSAVSRAGS LLWMVATS PGRLFRLLYWWAGTTWYRLTTAA 
SLLDVFVLTRRFS SLKTFLW FLLPLLLLTCLTYGAWYFYP YGLQ 
T FH PALVS WWAAKDS RRADEGWEARDS S PHFQAEQRVMS RVHS L 
ERRLEALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHED 
TLALLEGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDSE 
DLFKKI VRASQESEAR I QQLKSE WQSMTQES FQESSVKELRRLE 
DQLAGLQQELAALALKQSSVAEEVGLLPQQIOAVRDDVESQFPA 
WISQFLARGGGGRVGLLQREEMQAQLRELESKILTHVAEMQGKS 
ARE AAAS LS LT LQKEGVI GVTE EQ VHH I VKQ ALQR YS E DR I GLA 
DYALESGGASVISTRCSETYETKTALLSLFGIPLWYHSQSPRVI 
LQPDVHPGNCWAFQGPQGFAWRLSARIRPTAVTLEHVPKALSP 
NSTISSAPKDFAIFGFDEDLQQEGTLLGKFTYDQDGEPIQTFHF 
QAPTMATYQVVELRILTNWGHPEYTCIYRFRVHGEPAH 


6908 


3 


780 


QVPSAAWLMAVCGLGSRLGLGSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELVTEKGHTFAEELQKIQCTLQDV 
GSALATPCSSAREAHLKYTTFKAGPILELEQWIDKYTSQLPPLT 
AF I LPSGGKI SSALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
SPYLRGTI KMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCLYA 
LCWDT I KRS SQTGE WQN I AIMTE E P ELS PAYL I S EAMRRS RMS 
LYC 


6910 


1 


1068 


L V P WVI D S Y Y YGKLV I AP LN I VL YN I FT PHG P DL YGTE P W YF Y 
LINGFLNFNVAFALALLVLPLTSLMEYLLQRFHVQNIjGHPYWLT 
LAPMYIWFIIFFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVFQRYRLBHYTVTSNWLALGTVFLFGLLSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PS S FLLPDNWQLQF IPSE FRGQLP KP FAEGPLATRI VPTDMNDQ 
NLEE PSRY I D I S KCHYLVDLDTMRETPREPKYS SNKEEW I SLAY 
RPFLDASRS S KLLRAFYVPFLSDQ YTVYVNYT I LKPRKAKQ IRK 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLISIFGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKP VETHS FQMLFT I LSTGS ALKAQ S YEDAYRC I KSS I LLGS I 
SGGTD I I S C FMGHNFS L P VYKGE I QARNLGMAVEAWNEEGKAVW 
GESGELVCTKP I PCQPTHFWNDENGNKYRKAYFS KFPGI WAHGD 
YCRINPKTGGIVMLGRSDGTLNPNGVRFGSSEIYNIVESFEEVE 
DSLCVPQYNKYREERVILFLKMASGHAFQPDLVKRIRDAIRMGL 
SARHVPSLILETKGIPYTLNGKKVEVAVKQI 1AGKAVEQGGAFS 
NPETLDLYRDI PELQGF 


6913 


1643 


1558 


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAG Y PGTLI P YRCDLSNEED ILSMFSAI RSQHSGVDI 
CINNAGLARPDTLLSGSTSGWKDMFNVNVLALS ICTREAYQSMK 
ERNVDDGH 1 1 NINS MS GHRVLPLS VTHF YS AT KYAVTALTEGLR 
QELREAQTH I RATC I S PG WETQFAFKLHDKD P E KAAAT YE QMK 
CLKPEOVAEAVIYVXiSTPAHIQIGDIQMRPTEQVT 


6915 


254 


652 


GRSLS FKTFL I WVLIS I YQGGILM YGALVLFES EFVHWAI S FT 
ALI LTELLMVALTVRTWHWLMVVAEFLSLGCYVSS LAFLNE YFD 
VAF ITTVTFLWKVSAIT WS CLPLYVLKYLRRKLS P PS YCKLAS 


6916 


254 


652 


GRSLS FKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALI LTELLMVALTVRTWHWLMWAEFLSLGCYVSS LAFLNE YFD 
VAFITTVTFLWKVSAITVVS CLPLYVLKYLRRKLS P PS YCKLAS 


6917 


254 


652 


GRSLS FKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT 
ALI LTELLMVALT VRTWHWLM WAE FLS LGCYVS S LAFLNE YFD 
VAF ITTWFLWKVSAITWS CLPLYVLKYLRRKLS P PS YCKLAS 


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRSFPQWLPGGGGGQVSSCS 
DTDVPYLLLAVKSEPGRFAERQAVRETWOSPAPGIRLLFLLGSP 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRHC PT VS F VLRAQDDAFVHTPALLAHLRALP PAS ARS L YLG E 
VFTQAMPLRKPGGPFYVPES FFEGGYPAYASGGGYVIAGRLAPW 
LLRAAARVAPFPFEDVYTGLCIRALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQAS IRLWKQLQD PRLQC 


6919 


850 


41 


QGRRELSGSVFCPFIQQEPKEMLTLSEYHERVRSQGQQLQQLQA 
E LD KLH KE VS TVRAANS ERVAKLVFQRLNEDFVRKPD YALSS VG 
AS IDLQKTSHDYADRNTAY FWNRFS FWNYARP PTVI LE PHVFPG 
NCWAFEGDQGQ W I QLPGR VQ LS D I TLQHP P P S VEHTGG ANSAP 
RDFAVFFLLSFFTHQGLQVYDETEVSLGKFTFDVEKSEIQTFHL 
QNDP PAAFPKVKIQ I LSNWGH PRFTCLYRVRAHGVRTSEGAEGS 
AQGPH 


6920 


1418 


591 


EAQGPSKVHLTLKKKK 


6921 


2 


1711 


MNATRSEEQFHVINHAEQTLRKMENYLKEKQLCDVLLIAGHLRI 
PAHRLVLSAVSDYFAAMFTNDVLEAKQEEVRMEGVDPNAIjNSLV 
QYAYTGVLQLKEDTI ESLLAAACLLQLTQVIDVCSNFLI KQLHP 
SNCLGIRSFGDAQGCTEIiLNVAHKYTMEHFIEVIKNQEFLLLPA 
NEISKLLCSDDINVPDEETIFHALMQWVGHDVQNRQGEIiGMLLS 
YIRLPIiLPPQLLADLETSSMFTGDLECQKLLMEAMKYHLLPERR 
SMMQSPRTKPRKSTVGAL YAVGGMDAMKGTTT I E KYDLRTNSWL 
HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 
IWTVMPPMSTHRHGLGVATLEGPMYAVGGHDGWSYLNTVERWDP 
EGRQWNYVAS MSTPRSTVG WALNNKLYAIGGRDGSS CLKS MEY 
FD PHTNKWS LiCAPMS KRRGGVGVAT YNG FL Y WGGHDAPAS NHC 
SRLSDCVERYDPKGDSWSrVAPLSVPRDAVAVCPLGDKLYWGG 
YDGHT YLNTVES YDAQRNE WKEEVP VNIGRAGACVVVVKL P 


6922 


1075 


369 


LTPPAGIRHEVRDREREREREREREKFPLDSTGSELKQNIHSIT 
GLP PAMQKVM YKGLAPEDKTLRE I KVTSGAK I MGGGSTI ND VLA 
VNT P KDAAQQ DAKAE ENKKE P LCRQKQHRKVLD KG KP ED VM P S V 
KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGS 1 KNWSEPIEGHEDYHMMAFQLGPTEAS YYWVYWVPTQY 
VDAI KDTVLGKWQYF 


6923 


2469 


1660 


LGLFCILP IDTLCAVLERDTLS IRESRLFGAWRWAEAECQRQQ 
LP VTFGNKQKVLGKALS LI RF PLMTI EEFAAGPAQSGI LSDREV 
VNLFLHFTVNPKPRVEYIDRPRCCLRGKECCINRFQQVESRWGY 
SGTSDRIRFTVNRRIS I VGFGLYGS IHGPTDYQVNIQI I EYEKK 
QTLGQNDTGFSCDGTANTFRVMFKEPIEILPNVCYTACATLKGP 
DSHYGTKGLKKVVHETPAASKTVFFFFSSPGNNNGTSIBDGQIP 
EIIFYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGALAKKPYNPI IGETFHCSWEVP 
KDRVKPKRTASRS PAS CHE HPMADD P S KS YKLRFVAEQVSHH P P 
I S CF YCECEE KRLCVNTHVWTKS KFMGMS VGVS M I GEGVLRLLE 
HGEE YVFTLPS AYARS I LT I PW VELGGKVS INCAXTG YS AT V I F 
HTKPFYGGKVHRVTAEVKHNPTNTIVCKAHGEWNGTLEFTYNNG 
ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVTRYLRLGDI 
DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 
SPLESTLMGLEVQSFPV 


6925 


2 


1653 


RGGAAGAAMEPDSVIEDKTIELMCSVPRSLWLGCANLVESMCAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWS E S DQ VE FVEHLI S RM CHYQHGH I NS YLKPMLQRDF I TAL P 
EQGLDH IAEN I L S YLDARS LCAAE LVCKEWQRVI S EGMLWKKL I 
ERM VRTDPLWKGLS ERRGWDQ YLF KNRPTDGP PNS F YRS L Y PKI 
IQD I ET I ESNWRCGRHNLQR I QCRS ENS KGVYCLQ YDDEKI I SG 
LRDNSIKIWDKTSLEOjKVLTGHTGSVLCIjQYDERVIVTGSSDS 
TVR VWD VNTG EVLNT L I HHNEAVLHL R F S NGL MVTC S KDRS IAV 
WDMASATDITLRRVLVGHRAAVNWDFDDKYIVSASGDRTIKVW 
STSTCEFVRTLNGHKRGIACLQYRDRLWSGSSDNTIRLWDIEC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKIKVWDLQAALDP 
RAPASTLCLRTLVEHSGRVFRLQFDEFQIISSSHDDTILIWDFL 
NVPP S AQNETRS PS RTYTY I SR 


6926 


1 


733 


SGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPL 
DGYPLPTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGP PEP PAGPMHPRLGPEPAGPS I PGLLAP P S ALHVY YGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 
DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHGAISSWSDASSAVYYCNYPDV 


6927 


2 


1484 


LTLCGDIQLMLAQNANNRAAHLEEFHYQTKEDQEILHSLHRESS 
CQG FAWATDI/S TDLESQLSVS CKC YEAANE I LQFRDLKSQNPEH 
YVQVL KRMGN IRNE I G VF YMNQAAALQS E RLVS KSVSAAEQQLW 
KKS FS CFEKG IHNFES I EDATNAALLLCNTGRLMRICAQAHCGA 
GDELKREFSPEEGLYYNKAIDYYLKALRSLGTRDIHPAVWDSVN 
WELS TT YFTMATLQQD YAPLSRKAQE QIE KEVS EAMMKS LK YCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLHYSKAAKL FQLLKDAPCE LLR VQLERVAFAE FQMTS QNS 
NVGKL KTLSGALDIM VRTEHAFQL IQKELIEE FGQPKSGDAAAA 
ADASPSLNREEVMKLLSIFESRLSFLLLQSIKLLSSTKKKTSNN 
IEDDT I LKTNKH I YSQLLRATANKTATLLERINV I VHLLGQLAA 
GSAASSNAVQ 


6928 


1086 


77? 


EAI DL I NNLLQ VKMRKR YS VDKTLSHP WLQD YQTWLDL R ELE CK 
IGERYITHESDDLRWEKYAGEQGLQYPTHLINPSASHSDTPETE 
ETEMKALGERVSIL 


6929


1749 


607 


RDQRGYRDDRSPAREPGDVSARTRSGGGGGRSATTAMPPPVPNG 
NLHQHDPQDLRHNGNVWAGRPSCSRGPRRAIQKPQPAGGRRSG 
RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHLAGLQFRE 
QEVRNQGQARTNSTSAQKNERESI RQKLALGS FFDDGPG I YTS C 
SKSGKPSLSSRLQSGMNLQICFVNDSGSDKDSDADDSKTETSLD 
TPLS PM S KQS S S YS DRDTTE E ESE S LDDMDFLTRQ KKLQAEAKM 
ALAMAKPMAKMQVEVEKQNRKKSPVADIiLPHMPHISECLMKRSL 
KPTDLRDMTIGQLQVI VNDLHS QI ES LNEELVQLLL IRDELHTE 
QDAMLVDIEDLTRHAESQQKHMAEKMPAK 


6930 


131 


545 


FKDTANVFVSLFQMRNNFRHYFIEPSQLIOiFYDVITWIVTQVAI 
S YTWP FVLLS I KP S LTFYS S W Y YCLH ILGILVLLLLP VKKTQR 
RKNTHENIQLSQSKKFDEGENSLGQNSFSTTNNVCNQNQEIASR 
HSSLKQ 


6931 


2 


659 


FV KKij PNRPACLL VAS GAAEGVS AQS FLHCFTMAS TAFNLQVAT 
PGGKAME FVDVTE SNARWVQ D FRLKAYAS PAKLES IDGAR YHAL 
LI PSCPGAIiTDLASSGSLARILQHFHSES KPICAVGHGVAALCC 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVKDSG 
ACFSAS E PDAVHWLDRHL VTGQNAS S TVPAVQNLLFLCGSRK 


6932 


2 


1131 


FVDS PG QG EQAEEE EGG IQMNS RMRAHS PAEGAS VE S S S PGP KK 
SDMCEGCRSLAAGHPGYISHDKETSI KYVSHQHPSHPQLFS I VR 
QACVRSLSCEVCPGREGPIFFGDEQHGFVFSHTFFIKDSLARGF 
QRWYS I ITIMMDRI YLINS WPFLLGKVRG I IDELQGKALKVFEA 
EQFGCPQRAQRMNTAFTPFLHQRNGNAARSLTSLTSDDNLWACL 
HTSFAWLLKACGSRLTEKLLEGAPTEDTLVQMEKLADLEEESES 
WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTL 
KTLQEVTDS LLGG WLMAQG VGG 1 1 


6933 


1431 


890 


SLNLHCTLP P PPHQ YPAGYP S D KEGKKP KGQS KKQ PS GTTKRP I 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIREDCQNQKLW 
DEVLSHLVEGPNFLKKLEQSFMCVCCQELVYQPVTTECFHNVCK 
DCLQRSFKAQVFSCPACRHDLGQNYIMIPNEILQTLLDLFFPGY 
SKGR 


6934 


3030 


2586 


DRDHSQCGGIRRVALARVSSVKLISKAKIRTVKMTFI IVLAFIV 
CWTPFFFVQMW3VWDANAPKEASAFI I VMLLASLNSCCNPWIYM 
LFTGHLFHELVQRFLCCSASYLKGRRLGETSAS KKSNSS S FVLS 
HRSSSQRSCSQPSTA 


6935 


886 


543 


NSALYVAGGNDGTSCLNSVERYSPKAGAWESVAPMNIRRSTHDL 
VAMDGWLYAVGGNDGSSSLNSIEKYNPRTNKWVAASCMFTRRSS 
V<j VAV lit ijbN rP r PS S PTIjS V b £3 1 b 1j 


6936 


1347 


567 


R SHRRQ FLSRALLE F FGKSH P P PHRL FRKS LNVG LHYS H I P FLT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
ME KRLQEAQ L YKEEGNQRYREGKYRDAVS R YHRALLQLRGLD P S 
LPS PL PNLG P QG PALT PEQEN I LHTTQTD CYNNLAACLLiQME P V 
NYERVREYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYLL 
AAVNRQ P KD ANVRR YLQLTQS ELSS YHRKE KQLYLGM FG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDY 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
P C P PLEERAGCLE YS T PQGQD CGHTYVP AF I TTSAFNKERTRQA 
TSPHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


NSRKLELAERVDTDFMQLKKRRQSSEKENDSGTLDTVGAWVDH 
EGNVAAAVS SGGLALKHPGRVGQAALYGCGCWAFOTGAHNPYST 
AVSTSGCGEHLVRTILARECSHALQAEDAHQALLETN3QNKFISS 
P FLAS E DG VLGGVI VLRS CRCS AE P DS S QNKQTLLVE FLWSHTT 
ESMCVGYMSAQDGKAKTHISRLPPGAVAGQSVAIEGGVCRLGEP 
SELTLOAECEASORHFRT 


6939 


3 


810 


KVTAPRRP QR YSSGHG S DNS S VLSG ELP PAMGRTAL FHHS GGS S 
G YES LRRD S E ATGSAS S APDS MS ESGAAS PG ARTRS LKS P KKRA 
TG LQRRRL I P APLPDTTALGRKP S L PGQ WVDLP P PLAGS L KE P F 
E I KVYE I DDVERLQRPRPTPREAPTQGLACVS TRLRLAERRQQR 
LREVQAKHKHLCEELAETQGRLMLEPGRWLEQFEVDPELEPESA 
EYLAALERATAALEQCVNLCKAHVMMVTCFDISVAASAAI PGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCXSGTERAIDQASFTTSMEWDTQ 
WKGSS PLG PAGLGAEEPAAGPQLPS WLQ PERCAVFQCAQCHAV 
IjADSVHLAWDLSRSLGAVWSRVTNiWVLEAPFLVGIEGSLKGS 
TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELKEKIVLTHNRLKSLMKILS 
BVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI I GS NVLALAEAQRQAE ALG YQA 
WL SAAMQGDVKSMAQ F YGLIiAHVARTRLTP SMAGASVEEDAQL 
HELAAELQ I PDLQLEEALETMAWGRGPVCLLAGGEPTVQLQGSG 
RGGRNQELALRVGAELRRWPLGPIDVLFLSGGTDGQDGPTEAAG 
AWVTPELASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GD YVE RYD P KTDTWTMGAPLSMPTNAVGG CLLGDRL YADGG YDG 
QTYLNTMES YD PQTNE WTQMASLN I GRAGAC WV I KQ P 




J. 




DMaTrtnriLVTT.a TIJVIf .TaOQTTi T TWIT li^T.TJJV CGITDT. GMT. T3TJ"2 

HS PAGGS ITETLVQGDKTE YLLTALEPKPTY 1 1 CMVTMET TNAY 
VADETPVCAKAETADSYGPTTTLNQEQNAGPMASLPLAGI IGGA 
VALVFLFLVLGAI CW YVHQAGELLTRE RAYNRGS RKKDD YMESG 
TKKDNSILBIRGPGLQMLPINPYRAKEEYWHTIFPSNGSSLCK 
ATHT I G YGTTRG YRDGG I PD I DYS YT 


6944 


960


156 


VANILLNGVKYESELTGSSERAEQPLSVGRLCSTICNMPKALRT 
LCVNHFLGWLSFEGMLLFYTDFMGEWFQGDPKAPHTSEAYOKY 
NSGVTMGCWGMCIYAFSAAFYSAILEKLEEFLSVRTLYFIAYLA 
FGLGTGLATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVD I S LLS CQYFLAQ I LVSLVLGPLTS A 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 
LNV 


6945 


2067 


179 


BGEDRGLPRTMGAAIiGTGTRLAPWPGRACGALPRWTPTAPAQGC 
HSKPGPARPVPLKKRGYDVTRNPHLNICT1AFTLEERLQLGIHGL 
IPP C FLSQDVQLiLR IMRYYERQQSDLDKYI I LMTLQDRNE KLFY 
RVLTSDVEKFMP I VYTPTVGLACQHYGLTFRRPRGLFI TIHDKG 
HIATMLNSWPEDNIKAVWTDGERILGLGDLGCYGMGIPVGKLA 
LYTACGGVNPQQCLPVLLDVGTNNEELLRDPLYIGLKHQRVHGK 
AYDDLLDE FMQAVTDKFG INCL I QFEDFANANAFRLLNKYRNKY 
CM FNDD I QGTAS VAVAG ILAALRI TKNKLSNHVFGFQGAGEAAM 
G\IAHLLVMALE\KEGVPKA\EATRKIW\MVDF\KGLIVQGRDH 
LNHEKEMFAQD\HPEVNSLEEWRLVKPTAI IGVAAIAEA\ FTE 
Q I LRDMASFHERP\ 1 1 FALSNPTSKAECTA\EKCYRVTEG PRGF 
FAS \ GS PF * GVL I WEMGKTFI PGGRGNNA*RVPRG WQLGVHSPG 
GDPGHIP\DEIFLPDSRAKLPQEVSEQHLSQGRLYP\PLST\IR 
NVFLRIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDS Y I WAQGKAMNVQTV 


6946 


133 


2551 


SCEYSGITVAPGDPCPGVAHLLAPSMASDTPESLMALCTDFCLR 
NLDGTLGYLLDKETLRLHPD I FLPS EI \CDRLVNE YVELVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\liVQD\QD\LE 
AIRKQDL \VEL \ YLTN\ CE KLSAKSLQTLRS FSHTLGVP* AFFG 
C\TNILLLRKENPGGL/CEDEYLFNPTCQVLVKDFTFEGFSRIiR 
F\LKLGRMIDWVPVES\LLRPLNSLAALDLSGIQTSDAA\FLTQ 
WKDSL\VSLVL\YNMDLSDDHIR\VIVQLHKLRHLDISRDRLSS 
Y YKF KLTRE VLS LFVQ KLGNLMSLD I SG \HMI LENC S I S KI GKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 
I PAYKVSGDKNEEQVLNAI EAYTEHRPE I TSRAINLLFDI ARI E 
RCNQLLRALKL V I TALKCHKYDRN I QVTGS AALF YLTNS E YRS E 
QS VKLRRQVI Q WLNGMES YQE VT VQRNCCLTLCN FSIPEELEF 
QYRRVNELLLS ILNPTRQDES IQR1AVHLCNALVCQVDNDHKEA 
VGKMG FWTML KLTQKKLLDKTCDQ VME FS W\ SAL WNT TD ET PD 
NCEMFLNFNGMKLFLDCLNEFPEKQELHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLESKADGIEVSYNACGVLSHIMFDGPEA 
WGVCEPQREEVEERMWAAI QS WDINSRRNINYRS FE P ILRLLPQ 
GI S P VSQHWATWALYNLVS VYPDKYCPLL I KEGGMPLLRDI I KM 
ATARQETKEMARKVIEHCSNFKEENMDTSR 


6947


2 


1682 


TSVSTI PRGLAS AR PQSRS WRCCPVWRRS PGRARGRGLKMLNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 
DELMRAAGSDGTELFDQVHRWVNYESMLKECLVGRMAIKPAVLK 
DYREEEKKVLNGMLPKSQVTDTLAKEGPSYPSYDWFQTDSLVTI 
/EHIY*TEGYQFRLNNS*SSE*FLYSRWNY*GLLISYTYW/R*A 
MRFRKI FLCX3L/CES VGKI EI VLQKKENTSWDFLGHPLKNHNSL 
I PRKDTGL YYRKCQLISKEDVTHDTRLFCLMLPPS THLQ VP I GQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 
IYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQELEDLFLLA 
AGTGFTPMVKILNYALTDI PSLRECVKLMFFNKTEDDI IWRSQLE 
KLAFKDKRLDVEFVLSAPISEWNGKQGHISPALLSEFLKRNLDK 
S KVLVC I CGFVPFTEQGVR LLHDLNFSKNE IHS FTA 


6948 


104 


58 


PDGAHS FF PDE YFTCS SLCLS CG VGCKKS MNHGKEG VP HEAKS R 
CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWMGLAKY 
AWSGYVIECPNCGWYRSRQYWFGNQDPVDTWRTEIVHVWPGT 
DGFLKDNNNAAQRLLDGMNFMAQSVSBIjSLGPTKAVTSWLTDQI 
APAYWRPNSQILSCNKCATSFKDNDTKHHCRACGEGFCDSCSSK 
TRPVPERGWGPAPVRVCDNCYEAR/TRPVSCYRGTSGR*RRRRT 
QETVE 


6949 


152 


4656 


GLRLCLSRPLTRPGDDSVGGSAMAStiAGGVG6GGGGKIRTRRCH 
QGPIKPYQQGRQQHQGILSRVTESVKNIVPGWLQRYFNKNEDVC 
SCSTDTSEVPRWPENKEDHLVYADEESSNITDGRITPEPAVSNT
EEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 
SAFP IGSSGFSLVKE IKDSTSQHDDDNISTTSGFSSRASDKDIT 
VSKNTSLP PLWS PEAERSHSLSQHTATSSKKPAFNLSAFGTLS P 
SLGNSSILKTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPYQA 
PVRRQMKAKQLSAQSYGVTSSTARRILQSLEKMSSPLADAKRIP 
SIVSSPLNSP LDRSG I D I TD FQAKRE KVDSQ YP P VQRLMT P KPV 
S I ATNRS VYFKPS LTPS GE FRKTNQR I DKKCS TG YE KNMT PGQN 
REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 
LEEEEMEGPVLPEaSLPITSSSLPTFNFSSPEITTSSPSPINSS 
QALTNKVQMTS PSSTGS PM FKPSS P I VKSTEANVLPPS S IGFTF 
SVPVAKTAELSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 
EGPFRPAE I LKEGSVLD ILKS PGFAS P KIDS VAAQPTATS P WY 
TRPAI SSFS SSG IGFGES LKAGSS WQCOTCLLQtf KVTDNKC I AC 
QAAKLS PRDTAKQTGIET PNKSGKTTLSASGTGFGDKFKPVIGT 
WDCDTCLVQNKP EAI KCVACETPKPGTCVKRALTLTWS ESAET 
MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 
VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 
ELCLVQNKADSTKCLACESAKPGTKSG FKGFDTSSS SSNS AAS S 
S FKFGVS SS S SG P S QTLTS TGNFKFGDQGGFKI G VS S DSG Y I NP 
MS EG F * FS KH I VGF KFGVS S E S KPEE VKKDS KNDNFKFGLS FGL 
SNPVFLTPFQFGVSNLGQEEKKEELLKSSCAGFRFGTGVINSTR 
VPANT1VTSENKSS FNLGTI ETKSVS VAPLKCQTSEAKKEEMPA 
TKGGFSFGNVEPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 
KLTMKEPKC\QPVFSFGEFQRQTKDENSS KSTFS FSMTKPSEKE 
S EQP AKATFAFG AQTNTTADQGAAKP DLS YLNNS S SSSSTPATS 
AGGG \ I FGSSTS S SNPPVATFVFGQS SNPGS SS \ AFGNTAES S T 
SQSLL FSQDSKLATTS STGTAVTPFVFGPGASSNNTTTSGFGFG 
ATTTSSSAGSSFVFGTGPSAPSASPAFGANQTPTFGQSQGASQP 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
S AFGS GTTPNSS S AFQFGS S TTNFNFTNNS P SG VFTFGANS S T P 
AAS AQ PS G S GG FP FNQ S PAAFTVG SNG KNVFSS SGTS FSGRK I K 
TAVRRRK 


6950 


2585 


411 


PRPGSRSGLCRRAGERGAVRAGGLSRRTRAE*IMDELHYQDTDS
DVPEQRDSKCKVKWTHEEDEQLRALVRQFGQQDWKFLASHFPNR
TDQQCQYRWLRVLNPDLVKGPWTKEEDQKVIELVKKYGTKQWTL
IAKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRIICEAHKV
LGNRWAEIAKMLPGRTDNAVKNHWNSTIKRKVDTGGFLSESKDC
KPPVYLLLELEDKDGLQSAQPTEGQGSLLTWWPSVPPTIKEEEN
SEEELAAATTSKEQEPIGTDLDAVRTPBPLEEFPKREDQEGSPP
ETSLPYKWVVEAANLLIPAVGSSLSEALDLIESDPDAWCDLSKF
DLPEEPSAEDSINNSLVQLQASHQQQVLPPRQPSA\LVPSVTEY
RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK
RRVALSPVTENSTSLSFLDSCNSLTPKSTPVKTLPFSPSQFLNF
WNKQDTLELESPSLTSTPVCSQKVVVTTPLHRDKTPLHQKHAAF
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEEDLKE
VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV
DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF
LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE
KARQLLGRLKPSHTSRTLILS


6951 


1940 


239 


AGPDDTMKRS LQAL YCQLLSFLL I LALTEALAFAIQE PS PRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ* PPPILKAP/ SSTGPAPAAMAT 
TS S KPEGRPRGQ AAPT I LLT KPPGATS RPTTAP PRTTTRR P P RP 
PGSS RKGAGNS SRPVP PAPGGHSRS KEGQRGRNPSSTPLGQKRP 
LGKI FQI YKGUFTGS VE PEPSTLTPRTPLWGYS SS PQPQTVAAT 
TVPSNTSWAPTTTSLGPAKDKPGLRRAAQGGGSTFTSQGGTPDA 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 
SRPLSTS SGVFTAATG PTPAAFDTS VSAPS QG1 PQGAS TTPQAP 
THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPQWYPQPQAIS 
STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 
WPSACPSPP\LCPADGVLHEEEEEDRQPGEQPEAYGNNTHHPGT 
TFQQAC\RGAAPGEIPVPLKPLRTQLSEPRSPANGDYRDTGMVP 
C 


6952 


658 


304 


PESEGESGEMTDRYTIHSQLEHLQSKYIGT\ATPTPPSGSG\CE 
PTPRLVLLLHGPLRPSQLLRHCGE * EQS AS P LQ LDGKDAS ALWT 
ASRQARGELRLCLTTAVRGTSPSVSPVCQSS 


6953 


1512 


349 


NWG KTRAIAS GKHVP FGKQTNPNKS / VHCDS * G * * RRETTQDE S 
FS PHFRGKMGGW\ KLEKELENTEQPVGGNEG * EHEVTGNLNSD 
PLLELCQCPLCQLDCGSREQL I AHVYQHTAAWSAKS YM \CPVC 
GRALSSPGSLGRHLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 
EVLNMESLPTVHNEGPSSAEGKD I AFSPPVY PAGILLVCNNCAA 
YRKLLEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 
ETPEEREVRRMRDREAKRLQRMQETDEQRARRLQRDREAMRLKR 
AI ET PE KRQARL I REREAKRLKRR LE KMDMMLRAQFGQD PS AMA 
ALAAEMNFFQLP VSGVELDSQLLGKMAFEEQNS S SLH 


6954 


819 


1 


PPPPFIIPSHPREAGT*AG*KRSGDSECSPPVEQ*A*TRAAAQN 
* PQR* RWTEGNS PQAS AVAT PGQGAS PAAPRCTP * PSRRHRRLP 
PGARPPAG*AAPAPTKPWLAGPASAPQPGAAPLSPPAPPLIRTR 
*CAGAAARGRPRRDRSPRPRTPGGCSWSEPRTPPAVSASAQTPS
DAG*AGGR*GQRQRPSTGR* PPGVGGAGRSHRREGTI PGNPHPR 
AS*RAGWQR*PGP/REWGI>*EPQGEEMSGPGGPGGAPPNQVGSS 
VMQAMSTGI 


6955 


1968 


782 


PPGRRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRTMPFLGQD 
WRS PGWS WIKTEDGWKRCES CSQKLERENNHCNI SHS 1 1 LNS ED 
GE I FNNEEHE YAS KKRKKDHFRNDTNTQS FYREKWI YVHKE STK 
ERHG YCTLGEAFNR LDFS S AIQDIRR FN YWKLLQL I AKS Q LT S 
LSGVAQKNYFNI LDKIVQKVLDDHHNPRL IKDLLQDLSSTLC I Xj 
/N*RSREVCISGKHQYLDLPIRNYSRLATTATGSSDD*ASE\NG 
LTLSDLPLHMLNNILYRFSDGWDIITLGQVTPTLYMLSEDRQLW 
KKLCQYHFAEKQFCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 
GDTLHFCRHCS I L FWKDS GHPCTAAD PDSCFTP VS PQHF I DLFK 
F 


6956 


B605 


3839 


QTSTS I FAS PTS PPVLGESVLQDNSFDLNNGSDAEQEEMETQSS 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 
PE I S PE VCPAAST WS PAVFS WS PASSAVLPAVSLE VPLTAS V 
TS PKAS P VT S PAAAF PTAS PANKD VSS FLETTADVEE ITGEGLT 
ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 
YYGPCGKRMKQFPEVIKYLSRNWHSVRREHFSFSPRMPVGDFF 
EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 
P KVKRGRGR P PKVK I TELLN KTDNR PLKKLEAQETLNEEDKAKI 
AKSKKKMRQKVQRGECQTTIQGQARNKRKQETKSLKQKEAKKKS 
KAEKEKGKTKQEKLKEXVKRBKKEKVKWKEKEEVTKAKPACKAD 
KTLATQRRLEERQRQQMILEEMKKPTEDMCLTDHQPLPDFSRVP 
GLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 
LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLK I LGEKVSE I 
PLTRDNVSE I LRCFLMAYGVEPAL CDRLRTQPFQAQP PQQKAAV 
LAFLVHELNGSTLI INE IDKTLESMS S YRKNKWI VEGRLRRLKT 
VLAKRTGRSEVEMEGPEECLGRRRSSRIMEVTSGMEEEEEEESI 
AAVPGRRGRRDGEVDATASS I PELERQIEKLS KRQLFFRKKLLH 
SSQMLRAVSLGQDRYRRRYWVLPYLAGIFVEGTEGNLVPEEVIK 
KETDSLKVAAHASLNPALFSMKMELAGSNTTASS PARARGRPRK 
TKPGSMQPRHLKSPVRGQDSEQPQAQLQPEAQLHAPAQPQPQLQ 
LQLQ SHKGFLEQEG S PLS LGQ S QHDLS QS AFLSWLSQTQSHS S L 
LSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSPQQLASSKPMNRPSA^PCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 
FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLJCALHPR 
GI RE KALH KHLNKHRD FLQEVCLR P S ADP I F3PRQLP AFQEG I M 
SWSPKEKTYETDLAVLQWVEELEQRVIMSDLQIRGWTCPSPDST 
REDLAYCEHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL 
AALEQNVERRYLRE P LWPTHEWLE KALLSTPNGAPEGTTTE I S 
YE ITPRIRWRQTLERCRSAAQVCLCLGQLERS IAWEKS VNKVT 
CLVCRKGDNDE FLL L CDG C DRG CH I YCHR PKMEAVPEGDWFCT V 
CLAQQVEG EFTQ KPG F PKRGQKR KS GYS LNFS EGDGRRRR VLLR 
GRES PAAG PRYS EEGLS PSKRRRLS MRNHHS DLTFCE 1 1 LMEME 
SHDAAWPFLE PVNPRLVSG YRR 1 1 KNPMDFS TMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDS E VGKAGHI MRRFFE \SRWEE F 
YQGKQGQS VRQGRWG VTL WHL P PT F QTKT CHFHLLML P W VQTQ V 
RYWPDF 


6957 


82 


3514 


HLI VAMPE PTKKEENE VPAPAP P PE EPSKEKEAGTTPAKDWTLV 
ETPPGEEQAKQNANSQLSILFIEKPQGGTVKVGEDITFIAKVKA 
EDLSEKPTINGSRKWMDLASKAGKHLQLKETFERHSRVYTFEMQ 
1 1 KAKDNFAGNYRCEVTYKDKFDS CS FDLBVHESTGTTPN I DIR 
SAFKRSGEGQEDAGELDFSGLLKRREVKQQEEEPQVDVWELLKN 
TKPSE YEKIAFQ YESPTCSGMLKRLKRS I REEKKSAAFAKI LD P 
VYQVDKGGRVRFWELADPKLEVKWNKNGQELRPSTKYI FEDTR 
CQS ILNIDNCQMTDDSE YYVTAGDE KCSTELLVREPP IMVTKQL 
EDTTD YCGERVE LECEVS EDDAQVKWFKNGEE 1 1 LVQTR YRI RV 
EGKKHILIIBGATKADAADYSVMTTGGQSSAKLSVDLKPLKILT 
PLTDQTVNLGKE ICLKCE ISENI PGKWTKNGLPVQESDRLKWH 
KGR I H KLVIDHALTEDEGD YVFAPDAYNVTLPAKVHV I DP P KI I 
LDGLDADNTVTV I AGNKLRLE I P I SGE PPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 
RKKKQSSRWMRLNFDLCKETTFEPKKM IEGVAYEVRIFAVNA\ I 
GI S KPSMPSRPFVPLAVTSP PTLLTVDS VTDTTVTMRWRPPDHI 
GAAGLDGYVLEYCFEGSTSAKQSDENGEAAYDLPAEDWIVANKD 
LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPIIjVKE 
IIEPPKIHSP KHLKQTYI RRVGDRV I LVI P FQGKPR PELTWKKD 
GAEIDKNQINIRNSETDTIIFIRKAERSHSGKYDLQVKVDKFVE 
TAS ID IRI IDRPGPPQIVKIEDVWGRNVALTWTPPKDDGNAAIT 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 
MCGLSEDATMTKESAVTARDGKIYKNPVYEDFDFSEAPMFTQPL 
VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLEIRKPSPYDGGTYCCKAVNDLGTVEIECKLEVKVIAQ 






1663 


PRTSRVKTEGSQGSSAMDFSVKVDIEKEVTCPICLELLTEPLSL 
DCGHSFCQACITAKIKESVIISRGESSCPVCQTRFQPGNLRPNR 
H LAN1 VERVKE VKMS PQEGQ KRDVCE HHGKKLQ I FCKEDGKVI C 
WVCELSQEHQGHQTFRINEWKECQEKLQVALQRLIKENQEAEK 
LEDDIRQERTAWKNYIQIBRQKILKGFNEMRVILDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVQQRQDASTLISDLQRRLRGSSVEM 
LQDVI D VMKRSES WT LKKPKS VSKKL KS VFR VPDLS GMLQ VLKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSLNK 
RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTLFMAV\LPWLGFS 


6959 


1 


1469 


SLVHWEFGRGIEDFPYLFFQLTHCQQRICSVTQAGVQWCDHSS 
LQPQTPGLNQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPPNVT 
WTE LE DRDGRVYPH PQDIiLAAL P LALVLLAMRLAFERFIGL PLS 
RWLGVRDQTRRQ VKPNATLE KHFLTEGHR PKEPQLSLLAAQCGL 
TLQQTQRW FRRRRNQDRPQLTKKF CEAS WRFLFYLS S FVGGLS V 
LYHESWLWAPVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLELG 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVILMTFSYSANLLRIGSLVLLLHDSSDYLLEACKMVNYMQ 
YQQVCDALFL I FS FVF FYTRLVL F PTQ I LYTTYYES I SNRGPFF 
GYYFFNGLLMLI&LLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
E E SD S SEE AAAAQE P LQLKNGTAGGPR PAPTDGPRS RVAGRLTO 
RHTTAT 


6960 


387 


2068 


AKWAREKEMQEF\TRSFF\RGRPDLSTLTHSIVRRRYLAHSGRS
HLEPEEKQALKRLVEEEPLKMQVDEAASREDKLDLTKKGKRPPT
PCSDPERKRFRFNSE3ESGSEASSPDYFGPPAKNGVASRSHTHP
KEENPRRA\SKAVEESSDEERQRDLPAQRGEESSEEEEKGYKGK
TRKKPWKKQAPGKASVSRKQAREESEBSEAEPVQRTAKKVEGN
KGTKSLKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK
PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG
DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA
KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS
SRKGRTRSSSSSSDGSPEAK33GKAGSGRRGEDHPAVMRLKRYIR
ACGAHRNYKKLLGSCCSHKERLSILRAELEALGMKGTPSLGKCR
ALKEQREEAAEVASLDVANIISGSGRPRRRTAWNPLGEAAPPGE
LYRRTLDSDEERPRPAPPDWSHMRGIISSDGESN


6961 


340 


1646 


R P WS S PTMKPNFS LRLR I FNLNCWG I P YLS KHRADRMRRLGDFL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGIIGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
S GMVLNAYVTHLHAE YNRQKDI YLAHRVAQAWELAQ F IHHTSKK 
ADWLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL 
LALLCVLAAGGGAGEAAI LLWT P S VGL VLWAG AF YLFHVQ EVNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6962 


340 


1646 


RP WS S PTMKPN FSLRLR I FNLNCWG I P YLS KHRADRMRRLGDFL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGIIGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAE YNRQKDI YLAHRVAQAWELAQF I HHTS KK 
ADWLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSS THGP \AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL 
LALLCVLAAGGGAGEAAI LL WTPS VGLVLWAGAF YLFHVQE VNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6963 


374 


2618 


RVT PL I LKLLKKP KTAENQKAS EENE I TQPGG S S AKPGL P CLNF 
EAVLS P DP AL I HS THSLTNS HAHTGS SDCD I S CKGMTER I H S I N 
LHN FSNS VLETLNEQRNRGH FCDVTVR I HGS MLRAQRCVLAAGS 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQILTAASILQIKTVIDECTRIVSQNVGDVFPGIQDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQQMERYL 
STTPETTHCRKQPRPVRIQTLVGNIHIKQEMEDDYDYYGQQRVQ 
ILERNESEECTEDTDQAEGTESEPKGESFDSGVSSSIGTEPDSV 
EQQFGPGAARDSQAE PTQPEQAAEAPAEGGPQTNQLETGASSPE 
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 
PFLFSLPQPLAGQQTQFVTVSQPGLSTFTAQLPAPQPLASSAGH 
STASGQGEKKPYECTLCNKTFTAKQNYVKHMFVHTGEKPHQCSI 
CWRSFSLKDYLIK\HMVTHTGVRAYQCSICNKRFTQKSSLNVHM 
RLHRGEKSYECYICKKKFSHKTLLERHVALHSASNGTPPAGTPP 
GARAG P PG WACTEGTT Y VCS VC P AKFDQ I EQ FNDHMRMHVS DG 


6964 


1 


178


SGRPFFFFFSNTDVYFIKKVTNRWTAGSSYKMTRMKSIGKILLL 
QIFIG\NCSMFVLVI 


6965 


757 


208 


NVF I E P R I QG FM KTS AHPGQ KH PD FS MGLL FP LLAALE VCS CGS 
SGSLGYNLPQWH\GLLGRNTLVLLGQMRRISPFLCLKDRSDFRF 
PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTS S AAGTPGDLLGAGDGRRRS WGQWV I EGSTLALRRY 
FQBSISTLE 


6966 


820 


1867 


IITALGVRGMPGCPCPGCGMAGPRLLFLTALALELLGRAGGSQP 
ALRSRGTATACRLDNKESESWGALLSGERLDTWICSLLGSLMVG 
LSGVF PLLV I PLEMGTMLRS EAGAWRLKQLLS FALGGLLGNVFL 
HLLPEAWAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFLALEK 
/HVPGQQGGGDQ PGPQQR PHCCCRRAQWRP LSGPAGCRAR PRCR 
GP \D I KVSGYLNLLANTI DNFTHGLAVAASFLVS KKIGLLTTMA 
I LLHE I PHEVGD FAI LLRAG FDRWS AAKLQLS TALGGLLGAG FA 
ICTQSPKGVEETAAWVLPFTSGGFLYIALVNVLPDLLEEEDPW 


6967 


162 


633 


GFLPFKYWILDLSASSRMETDCNPMELSSMSGFEEGSELNGFEG 
TDMKDMRLEAEAWNDVLFAVNNMFVS KSLRCADDVAY INVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLQTPYHETVYSLLDTL\ 
SPAYREAFGKR\LLQRLEALKRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT 
LEQ FHLSSMSS LGGPAAFSARWAQEAYKKESAKE AGAAAVPAP V 
PAATEPPPVLHLPAIQPPPPVLPGPFFMPSDRSTERCETVLEGE 
TISCFWGGEKRLCLPQILNSVLRDFSLQQINAVCDELHIYCSR 
CTADQ LEI LKVMG I LP FS AP S CGL I T KTDAERL CNALL YGGAYP 
PPCKKELAASLALGLELSERSVRVYHE\CFGKCKGL\LVPELYS 
S PSAAC IQCLD\ CRLMYP PHKFWHSHKALENRTCHWGF \ DSA\ 
NWRAYILLSQDYTGKEEQARLGR\CLDDVKEKFDYGNKYKRRVP 
RVSSEPPASIRPKTDDTSSQSPAPSEKDKPSSWLRTLAGSSNKS 
LGCVHPRQRLSAFRPWS PAVSASBKELSPHLPAL I RDS F YS YKS 
FETAVAPNVALAPPAQQKWSSPPCAAAVSRAPEPLATCTQPRK 
RKLTVDTPGAPETLAPVAAPEEDKDSEAEVEVESREEFTSSLSS 
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKE KFLHEWKMRVKQEE KLSAALQAKRS 
LHQELEFLRVAKKEKLREATEAKRNLRKEIERJjRAENEKKMKEA 
NES RLRLKRB LE QARQAR VCDKGCEAGRLRAK YS AQ I E D LQ VKL 
QHAEADREQLRADLLREREAREHLEK\WK\ELQEQLWPRARPE 
AAGS EG \ AAELE P 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRLEREQKIoKLYQSATQAVFQKRQA 
GELDESVLELTSQILGANPDFATLWNCRREVLQQLETQKSPEEL 
AALVKAELGFLESCLRVNPKSYGTWHHRCWLLGRLPEPNWTREL 
ELCARFLEVDERNFHCWDYRRFVATQAAVPPAEELAFTDSLITR 
NF SNYS S WH YRS CLL PQLH PQPDS GPQGRL PED VLLKELELVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSF 
SR PLLVGSRME I LLLMVDDS PL IVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVIWTAGDVQKECVLLKGRQEGWCRDSTTDE 
QLFRCELSVEKSTVLQSELESCKELQELEPENKWCL\LTIILLM 
RALDPLLYEKETLQYFQTLK\AWDPKRATY\LDDLRSKFLLENS 
VLKMEYAEVRVLHLAHKDLTVLCHLEQLLLVTHLDLSHNRLRTL 
PPALAALRCLEDPPPRT\VLQASDNAIESLDGVTNLPRLQELLL 
CNNRLQQPAVLQ PIAS CPRLVLLNLQGNPLCQAVG I LEQLAELL 
PSVSSVLT 


6970 


3 


1528 


SFPPLLSSPSAVGEGKVAVAAPCPGRSECARAKMAYIQLBPLNE 
G FLS R I SG L LLCRWT CRHCCQKC YE S S CCQS S EDEVB I LG PF PA 
QTPPWLMASRSSDKDGDSVHTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRRISSLESRRPSSPLIDIKPIE FGVLS AKKEP I QPS V 
LR R T YNPDD YFR KFE PHL YSLDSNS DDVDSL TDEE I LS K YQLGM 
LHFSTQYDLLHNHLTVRVIEARDLPPPISHDGSRQDMAHSNPYV 
KICLLPDQKNS KQTGVKRKTQKPVFEERYTFE I P FLE AQRRTLL 
LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 
VELGELLLSLNYLPSAGRLJnyDVIRAKQLLQTDVSQGSDPFVKI 
QL VHGLKLVKT KKTS FLRGTI D P F YNESFS F KVPQEE LENASLV 
FTVFGHNMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVSPASLEVT 


6971 


37 


3702 


ACF YVPGSRS FKIj I PRHGL VNMGRSG KLP SG VS AKLKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 
RLGKSEAPETPMEEEAELVLTEKS SGTFLSGLSDCTNVTFS KVQ 
RFWESNSAAHKE I CAVLAAVTEVI RSQGGKETETE YFAAL I RKA 
AQHG VC S VLKGS E FMFE KAP AHHP AAI STAKFCIQE I EKS GGS K 
EATTTLHMLTLL KDLL P C F PEGL VKS CSE TLLRVMTL SHVLVTA 
CAMQAFHSLFHARPGLS TLS AELNAQ I ITALYD YVPSENDLQPL 
LAWLKVMEKAHINLVRLQWDLGLGHLPRFFGTAVTCLLSPHSQV 
LTAATQSLKEILKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 
EEGL T YKFHAAWS S VLQLL CVF FEACGRQAH P VMRKCLQ S LCDL 
RLSPHFPHTAALDQAVGAAVTSMGPEWLQAVPLEIDGSEETLD 
FPRSWLLPVIRDHVQETRLGFFTTYFLPLANTLKSKAMDLAQAG 
STVE S KI YDTLQWQMWTLL PGFCTRPTDVAI S FKGLARTLGMAI 
SERPDLRVTVCQALRTL ITKGCQAEADRAEVSRFAKNFLP I LFN 
LYGQ PVAAGDTPAPRRA VLETI RT YLTI TDTQLVNS LLE KAS EK 
VLDPASSDFTRLSVLDLWALAPCADEAAISKLYSTIRPYLESK 
AHGVQKKAYRVLEE VCAS PQGPGALFVQSHLEDLKKTLLDSLRS 
TSSPAKRPRLKCLBHIVRKLSAEHKEFITALIPEVILCTKEVSV 
GARKNAFALLVEMGHAFLRFGSNQEEALQCYLVLIYPGLVGAVT 
MVSCS I LTVLTHLLFEFKGLMGTSTVEQLLENVCLLIiASRTRDVV 
KSALGF I KVAVTVMDVAHIjAKHVQLVMEAIGKLSDDMRRHFRMK 
LRNLFT\KFIPK\FGILTWGKKAVGPKEYHRVLVNIRKAEARAK 
RHRALSQAAVEEEEEEEEEEEPAQGKGDSIEEILADSEDEEDNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 
TQPGPGRGRKKDHS FKVSADGRLI IREEADGNKMBEEEGAKGED 
EEMADPMEDVI IRNKKHQKLKHQKEAEEEELE I PPQ YQAGGSGI 
HRPVAKKAMPGAE Y KAKKAKGDVKKKGRPDP YAY I P LNR S KLNR 
RKKMKLQGQFKGLVKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAILLPLWRRTRPREATVPRGAAQRGRARSAEGRIPSSQSPS 
PAEAGGATRSPPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAATERARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
A I TLBS PD I KYPLRL I DRE 1 1 SHDTRRFRFALPS PQHI LGLP VG 
QHI YLS AR I DGNL WRP YT P I S S DDD KG FVDLV I KVYF KDTH P K 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPIIRTVKSVGMIAGGTGITPMLQVIRAIMKDPDDHTVCHLLF 
ANQTEXDI LLRPELEELRNKHSARFiGLWYTLDRAPEAWDYGQG \ 
FVNEEMIRDHLPPPE\EEPLVLMCX3PPPMIQYACLPNL\DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGLRAQKCGRPAPGVDAMVLCPVIGKLLHKRWLASA 
S PRRQE ILSNAGLRFE WPS KFKEKLDKAS FATP YG YAMETAKQ 
KAL EVANRL YQKD LRAPDW I GADTI VTVGGLI L E KP VDKQDAY 
RMLSRFE/SGREHSVFTGVAIVHCSSKDHQLDTRVSEFYEETKV 
KFS ELS EELLWE YVHS GE PMD KAGG YG I QALGGMLVES VHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 
S DVEGGGS E PTQRDAG SRDE KAEAGEAGQATAEAE CHRTRE TLP
PFPTRLLELIEGFMLSKGLLTACKLKVFDLLKDEAPQKAADIAS 
KVDASACGMERLLD I CAAMGLLEKTEQGYSNTETANVYLASDGE 
YSIjHGF I MHNNDLT WNLFTYLEFAIREGTNQHHRALGKKAEDLF 
QDAYYQS P ETRLRFMRAMHGMTKLTACQVATAFNLS R FS S ACD V 
GGCTGALARELAREYPRMQVTVFDLPDIIELAAHFQPPGPQAVQ 
I H FAAGDFFRDPLP SAELYVLCR I LHDWPDDKVHKLLSRVAESC 
KPGAGLLLVE TLLDEE KRVAQRALMQS LNMLVQTEGKERSLGE Y 
Q CLLELHGFHQVQ WHLGGVLD A I L\ P PKW P P EAQAACS L 


6974 


3032 


2172 


RS CAAFAS FASRPPLELFAP PGSHRS P PGRGVATS AQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSLLPKNISIESREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH 
LTTTLEEHS LGTPEAGVAATLSQS AAE PPTL I SPQAPAS SPSS L 
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


RPRPTVHCCKWALKLETAMETLINVFHAHSGKEGDKYKLSKKEL 
KE LLQTE LSGFLD VKE LML * ATE ALKT FE E A* KSPI IQCSSSRS 
SLPPAPQPPPYL*LSAVPFPIHLPLPLLPPQAQKDVDAVDKVMK 
BLDENGDGEVDFQEYWLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL*VAYGTTENSPVTFAHFPEDTVEQKAESVGRIMPHTEARI 
MNMEAGTIiAKLNTPGELCIRGYCVMLGYWGEPQKTEEAVDQDKW 
YWTGDVATMNEQGFCKIVGRSKDMIIRGGENIYPAELEDFFHTH 
P KVQEVQ WG VKDDRMGE E I CAC IRLKDGEETTVEE I KAFCKGK 
ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL*IKQQ 
ACPGRLA 


6977 


1298 


588 


SLFINTNLLSNQIRKTSFGMCSEPISDNTEDQKGKLKTPDFA*R 
ANKKS KHHVNGNRT VE PFPEGTQMAVFGMGC FWGAERKFWVLKG 
VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEWRWYQPEHMSFE 
ELLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLSEHGFGPITTDIREGQTFYYAEDYHQQYLSKNPNGY 
CGLGGTGVSCPVGIXK 


6978 


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKIjSKEAKQRIjQQLFKGSQ 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVLSLLWG 


6979 


3917 


1146 


DEARVRGEAVAAAILSRCRHWSGPPPFPPSPPDRKGLRGTEPWE 
AGPGSGATPGARAMDVRRLKVNELREELQRRGLDTRGLKTELAE 
RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSELEGT 
AQPP PPGLQPHAEPGGYS GPDGH YAMDNI TRQNQF YDTQVI KQE 
NESGYERRPLEMECX3QAYRPEMKTEMKQGAPTSFLPPEASQLKP 
DRQ Q FQS R KRP YE ENRGRGYFEHRED RRGRS PQP P AEE DEDDFD 
DTLVAIDTYNCDLHFKVARDRS SG YPL TI EG FA YL WS GARAS YG 
VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 
GEE P FS YG YGGTGKKS TNS RFENYGD KFAEND V I GCFADFE CGN 
D VELS FTKNGKWMGIAFR I QKEALGGQAL YPHVL VKNCAVE FNF 
GQRAE PYCSVLPG FTF I QHLPLSERI RGTVGPKS KAECE ILMMV 
GLPAAGKTTWAI KHAASN PS KKYN I LGTNAIMDKMRVMGLRRQR 
NYAGRWD VLI QQATQCLNRLIQIAARKKRNYI LDQTNVYGSAQR 
RKMRPFEGFQRKAIVICPTDEDLKDRTIKRTDEEGKDVPDHAVL 
EMKANFTLPDVGDFLDEVLFIELQREEADKLVRQYNEEGRKAGP 
PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 
PQQQPPPQQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNQGGYSQ 
GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAQTYPQPSY 
NQYQQYAQQWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTSTQ 


6980 


1 


420 


GTRGRKTGRVAAPSTRRRTGNMQKLQTRSPAMSLSDPGLGYHPT 
CWTLRWPPLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCSCEA 
GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPHPS 
APR LLE PQGVFSLF P P PPG P W PNM ILTKAQ YDE I AQ CLVSVPPT 
RQSLRKLKQRFPSQSQATLLSIFSQEYQKHIKRTHAKHHTSEAI 
ESYYQRYLNGWKNGAAPVLLDLANEVDYAPSLMARLILERFLQ 
EHEETPPSKSIINSMLRDPSQIPDGVLANQVYQCIVNDCCYGPL 
VDCIKHAIGHEHEVLLRDLLLEKNLSFLDEDQLRAKGYDKTPDF 
ILQVPVAVEGHI IHW I E5KAS FGDECSHHAYLHDQFWS YWNRFG 
PGLVI YWYGF I QELDCNRERG ILLKACFPTNI VTLCHS I A 


6982 


153 


1285 


FPQQDCSAPAAPGIiAGSEPRRLRAYRRRRQRARGLKRVAWLAPP 
P S LLQGLQG WAQAP VDGTLGPEDS RASS PM I QNS RPSLLQPQDV 
GDTVETLMLHPVIKAFLCGSISGTCSTLLFQPIiDLLKTRLQTLQ 
PSDHGSRRVGMLAVLLKVVRTESLLGLWKGMSPSIVRCVPGVGI 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTRYESGKYG YES I YAALRS I YHSEGHRGLFSGLTATLLRDAP F 
SGI YLMFYWQTKNI VPHDQVDATLI PITNFSCGI FAG1LASLVT 
QPADVIKTHMQLYPLKFQWIGQAVTLIFKDYGLRGFFQGGIPRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


EMS FLQDPS FFTMGMWS I GAGALGAAALALLLANTDVFLS KPQK 
AALEYLEDIDLKTLEKEPRTFKAKEIiWEKNGAVIMAVRRPGCFL 
CREEAADLSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
GEGFrLGGVF WGSGKQG I LLEHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


GGRSAYSLPAGSLPRVPATAAAKMASGVQVADEVCRIFYDMKVR 
KCSTPEEIKKRKKAVI FCLSADKKCI IVEEGKEILVGDVGVTIT 
D P FKHFVGML P E KDCR YAL YDAS FE TKES RKE ELM FFLWAPELA 
PLKS KM I YAS S KDAI KKKFQG I KHE CQANGPEDLNRAC I AE KLG 
GSLIVAFEGCPV 


6985 


1887 


1324 


RRTAG I YP CF P KPGRTRHALCS VVLLLLTGQ LAF DD FQES CAMM 
WQ KYAGS RRS MP LGAR I L FHGVF YAGGFAI VYYL I Q KFHS RAL Y 
YKIAVEQU3SHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
I P VSG S KS EGL L YVHS SRGGP FQRWHLD E VFLE LKDGQQ I P VFK 
LSGENGD E VKKE 


6986 


642 


1350 


YHLYFKMGDPNSRKKQALNRLRAQLRKKKESLADQFDFKMYIAF 
VFKEKKKKSALFEVSEVI PVMTNNYEENILKGVRDS S YSLESSL 
ELLQKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIEKIVC 
LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 
VNNPNQS VFLFIDRQHLQTPKNKATI FKLCS I CLYLPQEQLTHW 
AVGTI EDHLRP YMPE 


6987 


1623 


341 


LEAAE KAS RA FKE SQRQTDS KNYETENWS PQKSQRR YDM YNTAC 
FLGE I EVGLYT I Q I LQLTPFFHKENELS KKHMVQFLSGKWT I PP 
DPRNECYLALSKFTSHLKNLQSDLKRCFDFFIDYMVLLKMRYTQ 
KEIAEIMLSKKVSRCFRKYTELFCHLDPCIiIiQSKESQLLQEENC 
RKKLEALRADRFAGLLEYLNPNYKDATTMESIVNEYAFLLQQNS 
KKPMTNEKQNSILANIILSCLKPWSKLIQPLTTLKKQLREVLQF 
VGLSHQYPGPYFLACLLFWPENQELDQDSKLIEKYVSSLNRSFR 
GQ YKRMCRS KQAS TLF YLGKRKGLNS I VHKABCI EQ YFD KAQNTN 
SLWHSGDWiCKNBVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
ISVYSGPLRSGRNIERVSFYLGFSIEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAAS TAAPQDAQTGPQ PMPRAD C I MRHLPYFCRGQWRG 
FGRGSKQIiGI PTANFPEQWDNLPADI STGI YYGWASVGSGDVH 
KMWS IGWNPYYKNTKKSMETHIMHTFKEDFYGEILNVAI VGYL 
RPEKNFDSLESLISAIQGDIEEAKKRLELPEHLKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


LMPSDRPLSPSTHASAGSHCHAPPTTARRAFPIPFG^KSNMATL 
KDQL I YNLLKEEQTPQN KI T WG VGAVGMACAI S I LMKDLADE L 
ALVDVI EDKLKGEMMDLQHGS LFLRTP KI VSGKDYNVTANS KLV 
1 1 TAGARQQEGESRLNLVQRNVNI FKF 1 1 PNWKYS PNCKLL I V 
SNP VD I LT YVAWK I S GF PKNRV I GS GCNLDS ARFR YLMGERLGV 
HPLS CHG WVLGEHGDS S VP VWS GMNVAG VSL KTLHPDLG TDKDK 
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 
RVHPVS TMI KGLYGI KDDVFLS VPCILGQNGISDLVKVTLTS EE 
EARLKKSADTLWGIQKELQF 


6990 


719 


258 


THASGMAS WLALRTRTAVTS LLS PTPATALAVRYAS KKSGGS S 
KNTjGGKS S GRRQG I KKMEGHYVHAGNI IATQRHFRWHPGAHVGV 
GKNKCL YALE EG I VR YT KE V YVPHP RNTEAVDL ITRLP KG AVL Y 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RRSSDFHNPGFIiSRPVSLRENIHHQVICSTKNKRRNPKKIAYLL 
SSLLMTNLNPNESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQAPGCSS LALRQVRQVYCGLVRAPQVQTRPLS SRFVE RRGAL Y 
RS PMNQENP P P YPGPGPTAPYPP YP PQPMGPGPMGGP YP P PQGY 
P YQG YPQ YG WQGG PQE P P KTT VYWEDQRRDELGPS TCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


QWCVTCPQHNARQGPAVPPGIQAYGAAPFEDLQVDFTEMSKCRG 
DRVWIKNWNVASLCPLWKGPQTWLSPPTAVKVEGIPAWIHHSH 
VKPAARETWEARPS PDNPFRVTLKKTTSPAPVTPGS 


6994


346 


1100 


QWPEKDPVMAASSISSPWGKHVFKAIIimjVALir^HSAIaAQSR 
RDFAPPGQQKREAPVDVLTQIGRSVRGTLDAWIGPETMHLVSES 
SSQVLWAISSAISVAFFALSGIAAQLLNALGLAGDYLAQGLKLS 
PGQVQTFLLWGAGALWYWLLSLLLGLVIiALLGRILWGLKLVIF 
LAGF VALMRS V PDPS TRALIdjLALL I LYALL SRLTGS RAS GAQL 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 


6995 


144 


1346 


GSVAVGLSGIMAAQKDLWDAIVIGAGIQGCFTAYHLAKHRKRIL 
LLEQFFTjPHSRGSSHGQSRIIRKAYLEDFYTRMMHECYQIWAQL 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRALQDAIRQLG 
GIVRDGEKWEINPGLLVTVKTTSRSYQAKSLVITAGPWTNQLL 
RPLG I EMPLQTLRINVCYWREMVPGS YGVSQAFPCFLWLGLCPH 
H I YGLPTGE Y PGLMKVS YHHGNHAD P E ERDCPTARTD IGD VQ I L 
SSFVRDHLPDLKPEPAVIESCMYTNTPDEQF1LDRHPKYDNIVI 
GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KAHL 


6996 


543 


1942 


ETANAE AAARKS AMD W KE VLRRRLAT PNTCPNKKKS EQEL KDE E 
MDLFTKYYSEWKGGRKNTNEFYKTI PRFYYRLPAENEVLLQKLR 
EES RAVFLQRKS RELLDNE E LQNLWF L LDKHQT P PMIGEE AM I N 
YENFLKVGE KAGAKCKQFFTAKVFAKLLHTDS YGR I S I MQ F FNY 
VMRKVWLHQTRIGLSLYBVAGQGYLRESDLENYILELIPTLPQL 
DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACSFLDD 
LLELRDEELSKESQETNWFSAPSALRVYGQYLNLDKDHNGMLSK 
EELSRYGTATMTNVFLDRVFQECLTYDGEMDYKTYLDFVLALEN 
RKEPAALQYI FKLLDIENKG YLNVFS LNYFFRAIQELMKIHGQD 
PVSFQDVKDE I FDMVKPKD PLKI S LQDL I NSNQGDTVTT I L I DL 
NG FWT YENREAL VANDS ENS ADLDDT 


6997 


370 


1104 


AMELTI FILRLAI YILTFPIiYLLNFLGLWSWl CKKWFPYFLVRF 
TVIYNEQMASKKRELFSNLQEFAGPSGKLSLLEVGCGTGANFKF 
YPPGCRVTCIDPNPNFEKFLIKSIAENRHIiQFERFWAAGENMH 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QVADGSVDVWCTLVLCSVKNQERILREVCRVLRPGGAFYFMEH 
VAAECSTWNYFWQQVLDPAWHLLFDGCNLTRESWKALERASFSK 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998


2 


616 


FVSRALLRVRSRRHPAEERAAPGRPEDAPIECPGATNCPEPLWC 
SHLPVPYAPPTMESRGKSASSPKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKQQPPAAPTTAPAKKTSAKADPALLNNHSN 
LKPAP TVPS S PDATPE PKGPGDGAEEDEAAS GGPGGRGP WS CEN 
FNPLLVAGGVAVAAI AI* I LGVAFLVR KK 


6999 


14 


1591 


GRAGACSRRDTAMSIE I ESSDVIRLIMQYLKENSLHRALATLQE 
ETT VS LNTVDS I ES FVAD I NS GHWDTVLQA I Q S LKLPDKTL I DL 
YEQWLELIELRELGAARSLLRQTDPMIMLKQTQPERYIHLENL 
LARSYFDPREAYPDGSSKEKRRAAIAQALAGEVSWPPSRLMAL 
LGQALKWQQHQGLLPPGMTIDLFRGKAAVKDVEEEKFPTQLSRH 
IKFGQKSHVECARFSPDGQYLVTGSVDGFIEVWNFTTGKIRKDL 
KYQAQDNFMMMDDAVLCMC FS RDTEMIiATGAQDGKI KVW KI Q SG 
QCLRRFERAHSKGVTCLSFSKDSSQILSASFDQTIRIHGLKSGK 
TLKE FRGHSS F VNE AT FTQDGHY 1 1 S ASSDGTVKI WNMKTTECS 
NTFKSLGSTAGTDITWSVILLPKNPEHFWCNRSNTVVIMNMQ 
GQIVRSFSSGKREGGDFVCCALSPRGEWIYCVGEDFVLYCFSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGWFIiELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLLQPALTGDVEGLQKI FEDPENPHHEQAMQLLLEED IVGRN 
LL YAACMAGQ S DV I RALAKYG VNLNE KTTRG YTLLHCAAAWGRL 
ETLKALVELD VD I E ALNFRE ERARDVAAR YSQTE C VEFLD WADA 
RLTLKKYIAKVSLAVTDTEKGSGKLLKEDKNTILSACRAKNEWL 
ETHTEAS INELFEQRQQLEDIVTP I FTKMTTPCQVKSAKS VTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCL I IAFLKGCF I FI YF I F I FETEFLS CCPG WS AVAQSRL IAN 
FASQVQAI FILPKDSQVGP DVKSEAAP KRALYES VFGSGE I CGP 
TS PKRL C I RP S E P VD AWWS VKHDPL PLLPE ANGHRSTNS P T I 
VS PA I VS PTQD S R PNMSRPL I TRS PAS P LNNQG I PTP AQLTK SN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 
H YF I DRDGQM FRY I LNFLRTS KLL I P DDFKDYTLL YEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
SLIEEVFPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVLRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498 


PMPS S TRWTTS * T YTDTS S AWACRP TTGT CT * TAAPG PTVR WWP 
TP CS RHQSRRRLTC W CSTS RPCGR*GGLC VRTAP TR PTT S AS S S 
SWTSAGTSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRLSALLALASKVTLPPHYRYGMSPP 
GS VAD KRKNP P W I RRRP VWE P I SDE DW YLFCGDT VE I LEG KDA 
GKQG KWQ VI RQRNW WVGGLNTH YR Y I GKTMD YRGTMI P S EAP 
LLHRQVKLVDPMDRKPTEI EWRFTEAGERVRVSTRSGRI I PKPE 
F PRADG I VPETW I DG P KDTS VEDALERTYVP CLKTLQEEVMEAM 
GIKETR\NTRRSIGIEPGAEQLLPNFCPSLEG 


7004 


121 


2285 


FLIjPVIiTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G \ PKRTLKTQLG/ YYCRVRPLGFPDQECC IEVINWTTVQLHTPE 
G YRLNRNGDYKETQ YS FKQVFGTHTTQKE LFDWAN P LVNDL IH 
GKNGLLFTYGVTGSGKTHTMTGSPGEGGLLPRCLDMI FNS IGSF 
QAKRYVFKSNDRNSMD IQCE VDALLERQKREAMPNPKTSS S KRQ 
VDPEFADM I TVQE FCKAEEVDEDS VYGVFVS YIB I YNNYIYDLL 
EE VPFDP INPNLHNLNCFVKI KNHNM YVAGCTEVE VKS TEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNI KLVQAPLDADGDNVL 
QE KEQ I T I S QLS LVDLAGSERTNRTRAEGNRLREAGN INQSLMT 
LRTCMDVLRENQM YGTNKMVP YRD S KLTHLFKNYFDG EGKVRM I 
VCVNPKAED YBENLQVMRFAEVTQEVEVARPVDKAI CGLTPGRR 
YRNQPRGP\IGNEPLVTDWLQSFPPLPSCEILDINDEQTLPRL 
IEALBKRHNLRQMMIDEFNKQSNAFKALLQEFDNAVLSKENHMQ 
GKIiNE KEKM I SGQKLE I ERLE KKNKTLE YKI E I LE KTTT I YEED 
KRNLQQELETQNQKLQRQFSDKRRLEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRSVSPSPVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWtAERLGLFEEL 
WAAQVKRLASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQLARQ 
I SSTLADTAVAAQVNGEP YDLERPLETDSDLR FI/TFDS PEGKAV 
FWHSSTHVLGAAAEQFLGAVLCRGPSTEYGFYHDFFLGKERTIR 
GSELPVLERICQELTAAARPFRRLEASRDQLRQLFKDNPFKLHL 
I EEKVTGPTATVYG CGTLVDLCQGPHLRHTGQ IGGL KLLSNSS S 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWLQVLPVI LLLLGAHPS PLS FFS AGPAT 
VAAADRS KWH I P I P SGKNYFS FGK I L FRNTT I FLKFDGE PCDLS 
LNI TWYLKSADCYNE I YNFKAEEVELYLEKLKEKRGLSGKYQTS 
S KLFQNCS EltFKTQTFSGD FMHRL P LLGE KQE AKENGTNLT FI G 
DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKENSLSNLFTMT 
VE VKGP YE YLTLEDYP LMI FFMVMC I VYVLFGVIjWLAWS ACYWR 
DLLRIQFWIGAVIFLGMLEKAVFYAGFQ 


7007 


2 


1001 


AMTVSGPGTPEPRPATPGASSVEQLRKEGNELFKCGDYGGALAA 
YTQAU5LDATPQDQAVLHRNRAACHLKLEDYDKAETEASKAIEK 
DGGDVKALYRRSQALEKI/SRIJDQAVLDLQRC7SLEPKNKVFQEA 
LRNIGGQ I QEKVRYMS STDAKVEQMFQ ILIiDPEEKGTEKKQKAS 
QNLWLAREDAGABKI FRS NGVQLLQRLLDMGE TDLMLAALRTL 
VG ICSEHQSRTVATLS I LGTRRWS ILGVES QAVS IiAACHLIiQV 
MFDALKEGViCKGFRGKEGAI I VGEWKQVWGLLD VTVMEGMGLS Q 
PGQFFGDQTCSCRLFGI RFGDI I LI> 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWLGRPPSGLPPGPR 
S P P PLAG PGQKMVQKKPAELQG FHRS F KGQNP F EliAF SLDQ PDH 
GDSDFGLQCSARPDMPASQPI DI PDAKKRGKKKKRGRATDSFSG 
RFEDVYQLQE DVLGEGAHARVQTC I NL I TSQE YAVKI IE KQPGH 
IRSRVFREVEMLYQCQGHRNVLEL I E FFEEEDRFYLVFE KMRGG 
S I LSH IHKRRHFNELEAS VWQDVAS ALDFLHNKGI AHRDLKPE 
N I LCEHPNQVS PVK I CDFDLGSG I KLNGDCS P I STPELI/TPCGS 
AE YMAPE WEAFSEEAS I YDKRCDL WSLGVI LYTLLSGYPPFVG 
RCGSDCGWDRGEACPACQNMLFES I QEGKYE FPDKDWAHI SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNB 


7009


1 


626 


ARQLRNS W VDDF VAAP LIPLSQQIP TGNS LYE S YY KQ VD PAYTG 
RVGASEAALFLKKSGLSDIILGKIWDLADPEGKGFLDKQGFYVA 
LRLVACAQSGHEVTLSNLNLSMP P P KFHDTS S PLMVTPPSAEAH 
WAVRVEEKAKFDG I FESLLP I NGLLSGDKVKP VLMNS KLPLDVL 
GRVWDLSD IDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLSPLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
GS VRAALVDQSGVLLAFADQP I KNW E PQFNHHE QS SEDI WAACC 
WTKKWQG IDLNQ I RGLGFDATCS LWLDKQFHPLPVNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


riOtlpnqnqsqtqpllktppavlqpiapqttfgvqtqpqpqsl 
lqaqisaasitpllqtqpqpllqqpqqkagllqppvrivsqpqp 
arrldppsrfsgrndrgdqvpnrkddrsrbrererrrsrerspq 
rkrsrersprrerbrsprrvrrwprytvqfskfsldcpscdmm 
elrrryqnlyipsdffdaqftwvdafplsrpfqlgnycnfyvmh 






WO 01/53312 



PCT/US00/34263



RE VESLEKNMAI LD P PDADHL YS AKVMLMAS PS ME DL YHKS CAL 
AEDPQELRDGFQHPARLVKFLVGMKGKDEAMAI GGHWS PSLDGP 
DPEKDPSVL I KT\ AI RCCKALTG 


7012 


1 


2661 


RRAGS VKRGE ARLFG P T ERQS ERPLRP SAARRPEMLS GKKAAAA 
AAAAAAAATGTEAG PGTAGGS ENG S EVAAQ PAGLSG PAE VGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME 
TG IAETPEG \ RRTS RR KRAKVE YREMDES LANLSEDE Y YS EEE R 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKtQL 
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 
PTKKTGKVI I IGSGVSGLAAARQLQSFGMDVTLLEARDR VGGR V 
ATFRKGNYVADLGAMVVTGLGGNPMAWS KQVNMEIiAKI KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 
VSLGQALEVVIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV 
NLKE KI KE LHQQ YKEASE VKP PRD I TAE FL VKS KHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLS TLS LKHWDQDDDFE FTGS HLTVRNG YSCVPVALAEG 
LDIKLNTAVRQVRYTASGCEVIAVNrRSTSQTFIYKCDAVLCTL 
PLGVLKQQ P PAVQF VP PL P EW KTS AVQRMGFGNLNKWL CFDR V 
FWDPS VNLFGHVGS TTASRGEL FL FWNL YKAP I LLALVAGEAAG 
I MEN ISDDVI VGRCLAI LKGI FGS SAVPQPKETWSRWRADPWA 
RGS YSYVAAGSSGNDYDLMAQP I TPGPS I PGAPQ PI PRLFFAGE 
HT I RNYPATVHGALLSGLREAGR IADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7013 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGS ENG S EVAAQ PAGLSGPAE VG PGA 
VGERTPRKKE PPRAS PPGGLAE P PGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER 
NAKAE KEKKL P P PP PQAP PE EENE S E PEE P S GVEGAAFQS RL PH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TF EATLQQLEAP YNSDT VL VHR VHS YLERHGL I NFG I YKR I KPL 
PTKKTGKVI 1 IGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV 
AT FRKGNY VADLGAMWTGLGGN P MAWS KQ VNMELAKI KQKCP 
L Y EANGQAVPKE KDEMVEQE FNRLLEATSYLSHQLDFNVLNNKP 
VSLGQALEWIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV 
NLKE KI KELHQQ YKEASE VKPPRDI TAE FLVKS KHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLS TLSLKHWDQDDDFEFTGSHLT VRNG YS CVP VALAEG 
LD I KLNTAVRQVRYTASG CE V I AVNTRS TSQTF I YKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPS VNLFGHVGS TTASRGELFL FWNL YKAP I LLALVAGEAAG 
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA 
RGS YSYVAAGSSGNDYDLMAQP I TPGPS I PGAPQP I PRLFFAGE 
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7014 


3


3950 


DFEVGDKI RI LATLEDG WLEGS LKGRTG I F P YRFVKLCPDTRVE 
ETMALPQEGSLARI PETSLDCLENTLGVEEQRHETS DHEAEEPD 
CIISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVEWEM 
PLATDSPTSDPTEWNG I SSQPQ VP FHPNLQKSQYYSTVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
S VSASRWKPRQSS PQLHNLAS YTKKHHTS S VYS ISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDSKLTQQLIEFEKSLAGPGTEP 
DKILRHFSIMDFNSEKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTP VS TS PHLL VDQNLKPAP PLWRPS R P APL P P S AQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
RDLDMYSRAQEELNLMLEEKQDESSRAETLEDLKFCESNIESLN' 
MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV 
I KVS KQLLAALE I S DAVG P VFLGHRDELEGTYKI YCQNHDEAI A 
LLEIYEKDEKIQKHLQDSLADLKSLYNEWGCTNYINLGSFLIKP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKEINVNINE 
YKRRKDLVLKYRKGDEDS LME KI S KLNIHS 1 1 KKSNRVS SHLKH 
LTG FAPQ I KDE VFEETEKNFRMQERL I KS F I RDLSL YLQH I RE S 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 
TERLVISPLNQLLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARNNYEALNAQLLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQALEQLKPLLS LLKVAGREGNL I A I FHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRAS LLARYP PE KLFQAERNFNAAQDLDVS LLEGDLVGVI KKK 
DPMGSQNRWLIDNGVTKGFVYSSFLKPYNPRRSHSDASVGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRSYRNFRHPEIVGYSVPGRNGQSQDLVKG 
CARTAQAPEDRS TEPDGS EAEGNQ VYFAVYTFKARNPNELS VSA 
NQKLKIIiEFKDVTGNTEWWLAEVNGKKGYVPSNYIRKTEYT 


7015 


1842 


513 


RQAWHE WAAPSWRGARLVQSVLRVWQVGPHVARERVI PFSSLL 
GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVLIiVHHPDMPENSRVLRVVLLGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQAU3VITEKETQVI lldtp 
GIISPGKQKRHHLELSLLEDPWKSMESADIjVWLVDVSDKWTRN 
QLSPQLLRCLTKYSQIPSVLVMNKVDCLKQKSVLLELTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKEIFMLSALSQEDVKTLKQYLLTQAQPGPWEYHSAVLTSQTPE 
EICANIIREKLLEHLPQEVPYNVQQKTAVWEEGPGGELVIQQKL 
LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016 


167 


2513 


I LNAP KPP PPRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQKLVSQIEDAMRKAGVAHSKSSKDMESHVFLKAKTRDEYLS 
LVARL I IHFRD I HNKKSQASVSDPMNALQS LTGG P AAGAAG I GM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSS S S SSRRRYSSS SSSSNS KQ 
FQAQQSAMQQ\QFQA\ WQQQQQL \QQQQQQQQHLI KLHHQNQQ 

Q I QQQQQQLQR I aqlqlqqqqqqqqqqqqqqqqalqaqp p iqqp 

PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSQAQALPGQMLYTQPPLKFVRAPMWQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQS SL PMLSS PS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QS PVTARTPQNFS VPS PGPLNTPVNPS SVMSPAGSSQAEEQQ Y 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
SKRCPLKTLQKCE IALEKLKNDMAVPTPPP PPVP PTKQQYLCQP 
LLDAVLAN IRSPVFNHS LYRTFV PAMTAIHGPPI TAP WCTRKR 
RLEDDERQS I PS VLQGE VARLDPKFLVNLDPSHCSNNGTVHL I C 
KLDDKDLPSVPPLELSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HRCMTSRLLQLPDKHS VTALLNTWAQS VHQACLS AA 


7017 


1 


1785 


INLGNTCYMNS VI * AL FMATD FRRQ VLS LNLNGCNS LMKKLQHL 
FAFLAHTQREAYAPR I FFEAS RP P WFTPRSQQDCSE YLRFLLDR 
LHEEEKILKVQASHKPSEILECSETSLQEVASKAAVLTETPRTS 
DGEKTLIEKMFGGKLRTHXRCLNCRSTSQKAEAFTDLSLAFWPS 
YSLEYMSCPDCSQSPSIQDGGIiMQASVPGPSEEPWYNPTTAAF 
I CDSL VNEKT I GS PPNEF YCS ENTS VPNESNKIL VNKD VPQKPG 
GETTPSVTDLLNYFLAPEILTGDNQYYCENCASLQNAEKTMQIT 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITS 
FSS LS ES WS VDVD FTDLS ENLAKKLKP SGTDEAS CTKL VP Y LLS 
SWVHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDSRVTFTS FQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
LMDAITKDNKLYLQEQELNARARALQAASASCSFRPNGFDDNDP 
PGSCGPTGGGGGGGFNTVGRLVF 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEIIFQ 
PS L IGEEQAG I AETLQYILDR YPKDVQEMLVQNVFLTGGNTMYP 
GMKARMEKELLEMRPFRSSFQVQLASNPVLDAWYGARDWALNHL 
DDNEVW I TRKE YEEKGGE YX^KEHCASN I YVP I RLPKQAS RS SDA 
QAS S KGS AAGGGGAGE QA 


7019 


1048 


335 


APGGFLVTMVFPAPSPPWMLGCCSHEVTAGPPTLCKDMSALVAA 
RMRH I PLAPGS DWRDLPN I EVRLS DGTMARKLR YTHHDRKNGRS 
SSGALRGVCSCVEAGKACDPAARQFNTLIPWCLPHTGNRHNHWA 
GLYGRLE W DGF FSTT VTNF E PMGKQGR VLH PEQHR WS VRECAR 
SQGFPDTYRLFGNI LDKHRQVGNAVPP PLAKAIGLE I KLCMLAK 
ARE SASAKI KEEEAAKD 


7020 


1 


2154 


FADSKRKS VLLDKI KNLQVALTS KQQS LETAMS FVARNTFKRVR 
NGF LMRKVAVF FSNT PTRAS P Q LRE AVLKLS DAG ITPL FLTRQE 
DRQL INALQ INNTAVGHALVLP AGRDLTDFLENVLTCHVCLDI C 
NIDPSCGFGSWRPS FRDRRAAGS D VDI DMAF I LDS AETTTLFQF 
NEMKKYIAYLVRQLDMSPDPKASQHFARVAWQHAPSESVDNAS 
M PP VKVE F S LTDYG S KE KL VD FLS RGMTQLQGTRALGS AI EYT I 
ENVFESAPNPRDLKIWLMLTGEVPEQQLEEAQRVILQAKCKGY 
FFWLGIGRKVNIKEVYTFASEPNDVFFKLVDKSTELNEEPLMR 
FGRLLPSFVS S ENAFYLS PDIRKQCDWFQGDQPTKNLVKFGHKQ 
VNVPNNVTSSPTSNPVTTTKPVTTTKPVTTTTKPVTTTTKPVTI 
I NQ PS VKPAAAKPAPAKP VAAKP VATKTATVR P P VAVKPATAAK 
PVAAKPAAVRPPAAAAAKPVATKPEVPRPQAAKPAATKPATTKP 
MVKMS REVQVFE I TENS AKLHWERPE PPG P Y F YDLT VTS AHDQS 
LVLKQNLTVTDRVIGGLIiAGQTYHVAWCYLRSQVRATYHGSFS 
TKKSQPPPPQPARSASSSTINLMVSTEPLALTETDICKLPKDEG 
TCRDF I LKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA 
PVLAKPGVISVMGT 


7021 


2 


338 


VNAVSFFPNGYAFATGSDDATCRLFDLRADQELLLYSHDNIICG 
ITSVAFSKSGRLLLAGYDDFNCNVWDTLKGDRAGVLAGHDNRVS 
CLGVTDDGMAVATGS WDS FLRI WN 


7022 


2 


856 


VYIGSFWSHPLLIPDNRKLFEAEEQDLFRDIQSLPRNAALRKLN 
DLI KRARLAKVHAYI ISSLKKEMPSVFGKDNKKKELVNNLAEIY 
GRIEREHQISPGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHDIAQLMVLVRQEESQRPIQMVKGGAFEGTIiHGPFGHGYG 
EGAGEGIDDAEWVVARDKPMYDEIFYTIjSPVDGKITGANAKKEM 
VRSKLPNSVLGKIWKLADIDKDGMLDDDEFALA1JHLIKVKLEGH 
ELPNELPAHLLPPSKRKVAE 


7023 


2 


748 


AMVFGGWPYVPQYRDIRRTQNADGFSTYVCLVLLVANILRILF 
WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADSKDEEVKVAPRRSFLDFDPHHFWQWSSFSDYVQCVIiAFTG 
VAGYI TYLS IDSALFVETLGFLAVLTEAMLGVPQLYRNHRHQST 
EGMS I KMVLMW TS GDAFKTAYFLL KGAPLQFS VCGLIK3 VLVDIjA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVLSSGEQLQMPVKPERGLGPSDGWLV 
S SRRGS PGTVLGLP FWLLTP VLVSRS I RSMLLLTRS PTAWHRLS 
QLKPP VL PGTLGGQALHLRS WLLS RQGPAETGGQG QPQG PGLRT 
RLLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDI CPDELEKLVQV 
VRQLEAEPGLPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGL 
TGSTKQVAQASHSYRVYYNAGPKDEDQDYIVDHSIAIYLLNPDG 
LFTDYYGRS RSAEQ I SDSVRRHMAAFRS VLS 


7025 


232 


832 


ERNSP IGNNENL * K\ HS LDCLCFRGDWEGNTQPQTLQDNQEECF 
KQ V I RTCE KR PTFNQHTVFNLHQRLNTGDKLNE FKE LGKAF I S G 
SDHTQHQL I HTS E KFCGDKECGNTFLPDSE VIQYQT VHTVKKT Y 
ECKECGKSFSLRSSLTGHKRIHTGEKPFKCKDCX3KAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026 


328 


1146 


NPNPSIGDIKDIKKAAKSMLDPAHKSHFHPVTPSLVFLCFIFDG 
LHQALLSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
S S VLRHCCDLL I GVAAGS S DK I CTS SLQVQRRF ECAMMAS I GRL S 
HGESADLLISCNAESAIGWISSRPWVGELMFTFLFGDFESPLHK 
LRKS S * LPRKHR *QP INAVRMFLDQCMDGS I ALRAI VSE I PVFE 
E KKNNG * KG I GE I F * VWG CTL P PHYWGAVTTNVPKL SNS GKLLG 
QDEQPHIFG 


7027


43 


954 


GRRLQQQQRPEDAEDGAEGGGKRGEAGWEGGYPEIVKENKLFEH 
YYQELKIVPEGEWGQFMDALREPLPATLRITGYKSHAKEILHCL 
KNKYFKELEDLEMDGQKVEVPQPLSWYPEBLAWHTNLSRKILRK 
SPHLEKFHQFLVSETESGNISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYLLVH 
QAKRLS S PCI MWNHD AS S I PRLQ I DVDGRKE I LFYDR I LCDVP 
CSGDGTMRKN IDVWKKWTTLNSLQLHGLQLRIATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSRNFIRNS 
KKMQSWYSMIiSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 
EILLQGRLYLSENWICFYSNIFRWETTISIQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 


40 


VLSSNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQLRLWG 
/ PCPHAGRETGPRASAP I PGS * GHGWHW* RKDGRGERS EG PSAL 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA* RSLPGAAAS ERTEMTXERSP /RPCQG YDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCLLYPLQSIMPE*QLR*GAHASPPTQG 
R*GKGGPRSFLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS \ 
AAPGQKRGHREA* QGPE PV/ WGRVTTHLQGPAG * TKPLGS \RNW 
VPGPAEGEQGEGAG LEGR P * PLKGCRSTLTFSPQLS IPMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 
PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030


2 


521 


FVCFSAPGSGQGGKRRVNMEIiSAVGERVFAAEALLKRRIRKGRM 
EYLVKWKGWSQKYS TWEPEENI LDARLLAAFEEREREMEIiYGPK 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGIRIPYPGRSPQDL 
ASTSRAREGLRN\RVCPRQRAAPAPAAP\PRRGPSGPGPRPG+G 
PGLHFPGPGGPSKHGFVPASEQHQHQQHLPRRGPSGPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/ CKPS /RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCLA 

SAAAPRPS LGSGQNASGLPAASLPPQDS SQPHKTVPS PARS VP P 
IiGAQARAAPPRLWCPRALVSG*EASPEAVSVAAGPPVPGPTPST 
SGSTASHS RRGC* S PR * TPAP PRRDHGRS AAFE VLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSIi 


7032 


1393 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHLAPL*QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
EPWMKRQFGRLHSLFWKSWQKMNSFLLTPiCLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP * LLLPPR 


7033 


^89 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 
LMMPSSCPWRTGALGPSPAGSRALGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT+SSSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EOTSSMPFRLL I PLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD 
TLL\TLFYFQII*GNVSEFQRWEVLQDSVDFDIDVNASVFETNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW 
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY 
TVWKQFGGLPE F YNI PQG YTVE KREG Y PLRP E LIE S AM YL YRAT 
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF 
FLAETVKYLYLL FDPTNF IHNNGS TFDAVITPYGE CI LGAGGY I 
FNTE AHP I DP AALHCCQRL KEEQWE VE DLMRE FYSLKRSRS KFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTS KLALLGQVFLDSS * PLDNFFI FI FLRLNYNKLLLAI I KK 
K 


7035 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDEIiRPLTCDGHDTWGSFSLTLlDALD 
TLL\TLFYFQI LGNVSE FQR WEVLQDS VDFD ID VNASVFE TNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG 
I GAGVDS YFE YLVKGAI LLQDKKLMAM FLEYNKA I RNYTR FDD W- 
YLWVQM YKGTVSM P VFQS LEAYWPGLQ S L I GD IDNAMRT FLNY Y 
TVWKQFGGLPE FYNIPQGYTVEKREGYPLiRPEL I ESAMYL YRAT 
GDPTLLELGRDAVES IEKI SKVECGPATI KDLRDHKLDNRMES F 
FLAETVKYLYLL FDPTNF I HNNG S TFDAV I TP YGE CI LGAGG YI 
FNTEAH P I DPAALHCCQRLKEEQ WEVEDLMREFYS LKRSRS KFQ 
KNTVS SGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLS CPS 
QP FTS KLALLGQVFLDSS * PLDNFF I FI FLRLNYNKLLLAI I KK 
K 


7036 


442 


761 


CLAPLFS CFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
P PPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN*LPGEGPS I PT 
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
QYNKLLE KSDLHSVLAQ KLQAEKHDVPNRHE I SPGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQI TFTALEGKLRKTTEENQELVTRWMAEKAQEANRIiNARE* KR 
LQEAAS PAAERACRS S KGTS TS RTG 


7039


155 


891 


GAGAASDMS SGLRAAD F PRW KRH I S EQLRRRDRLQRQAFEE 1 1 L 
QYNKLLE KSDLHSVLAQKI^QAEKHDVPNRHE I SPGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKI AECLQTI S DLETECLDLRTKLCDLERANQTLKDE YDA 
LQ I TFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE * KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYESVMRDSEATGSASSAQDSTSENSSSVGGRCRSLKTPKKRSN 
PGSQRRRLIPALSLDTSSPVRKPPNSTGVRWVDGPLRSSPRGLG 
EPFEIKVYEIDDVERLQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKEbEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEALECVTERLESRVNFCKAHLMMITCFD IT 


7041 


1 


5*7 


SGR VAMG RRRAPAGGS IX3RALMRHQTQRS RS HRHTDS WLHT S EL 
NDGYDWGRLNLQSVTEQSSLDDFLATAELAGTEFVAEKLNIKFV 
PAEARTGLLS FEES QR I KKLHE ENKQ PLC I PRRFNWNQNTTPE E 
L KQ AE KDN FLEWRRQL \ VRLEE EQKLI LT P F ERNLDFWRQLWR V 
IERSDIWQIVDA 


7042 


7 


345 


PIHMAAAALRADI\ISPLFPHIQGYLLLSASHG\ATSLHTKGAL 
PLE TVTM YTVI P KS KYVLVKPDTQ YP Y S ENLDE FKRLAENS ASN 
DDLLMAEVAI SDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDSDSEEDLVS YGTGLEPLEEGERPKKP I PLQDQTVRD 
E KGRYKR FHGAFSGG FS AG Y FNT VGS KEGWTP S T FVS S RQNRAD 
KSVIX3PEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 
QLAAATAP I PGATLLDDLITPAKLS VGFELLRKMGWKEGQGVGP 
R VKRRPRRQKPDPG VKI YGCALP PGS S EGS EG E DDD YLP DNVT F 
APKDVTPVDFTPKDNVHGIAYKGLDPHQALFGTSGEHFNLFSGG 
S ERAGDLGE IGLNKGR KLGISGQAFG VGALEEEDDD I YATETL S 
KYDTVLKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILiDGF 
S LAS KP LS S KKI YP P P ELPRD YR P VHY FRPMVAATS ENS HLLQ V 
LSESAGKATPDPGTHSKHQLNASKRAELLGETPIQGSATSVLEF 
LS QXDKER I KSMKQATDLKAAQLKARSLAQN AQSSRAQPS PAAA 
AGHCSWNMALGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THAXE EDDS DQ VEVPRDQEND VGDKQ S AVKMKMFGKLTRDTFE W 
HPDKLLFQ/RLVGLPRVKRDKYSVFNFLTLPETASLPTTQASSE 
KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFLRLARSKAEPPK 
QQSSPLVNKEEEHAPELSAN 


7044 


276 


734 


EVYLTDE FAKGRKVADLYE LVQ YAGN 1 1 PRL YLLITVG WYVKS 
FPQS R KD I LKDL VE MCRG VQH PLRG L FLRNYLLQ CTRN I LP DEG 
E PTDE ETTGD I S DSMD FVLLNFAEMNKLWVRMQHQGHS RDREKR 
ERERQELR I L VG TNL VRL SQ V 


7045 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTDI EGTLFVYRRSASPYHGFTIVNRLNMHNLVE PVNK 
DLE FQLHE PFLL YRNASLS I YS I W FYDKNDCHR I AKLMAD WE E 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTDI EGTLFVYRRSASPYHGFTIVNRLNMHNLVE PVNK 
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7047 


103 


486 


QMKI EKOGWSEGLTS I KGNCHNFYTAI S KDVT YKELKNLLNSKN 
IML IDVRE I WE ILEYQKI PESINVPLDE VGEALQMNPRD FKEKY 
NEVKPSKSDS / IVFS YLAGVRSKKALDTAISLGFHSYYER 


7048 


92 


627 


FFCLTLLSSWDYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDLAMT YKQRAENTQEELRE FQEGS RE YE AELE TQLQQ I E TRN 
RDLLS ENNRLRMELET I KEKFEVQHSEG YRQI S ALEDDLAQTKA 

IKDQLQKYIRELEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
BKKM 


7049 


393 


938 


KRTGS AS YGGP PPGLGGPATXAS VAGRCSS VGKI PARRCYEDEL 
VPVFE AVGR I YELRLMMD FDGKNRGYAFVM YCHKHEAKRAVREL 
NNYE I RPGRLLGVCCSVDNCRLFIGGI PKMKKREE ILEE IAKVT 
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7050 


393 


938 


KRTGS AS YGG P PPG LGG PATXAS VAG RCSS VG K I PARRC YEDE L 
VP VPEAVGR I YE LRLMMDFDG KNRG YAF VM Y CHKHEAKRAVR EL 
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEBIAKVT 
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7051 


119 


816 


KKMNLAEICDNAKKGREYALLGNYDSSMVYYQGVMQQIQRHCQS 
VRDPAIKGKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDFPV 
SCQDEPFRDPAVWPPPVPAEHRAPPQIRR/RQSRSKTSEERNGR 
SRSPGTCRPSr\PISKSEKPSTSRDKDYRARGRDDKGRKNMQDG 
ASDGEMPKFDGAGYDKDLVEALERDIVSRNPSIHWDDIADLEEA 
KKLLREAGVLPMWM 


7052 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKWLDVGSGTGILSNFAARQGPRR 


7053 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR 


7054 


1 


1036 


GTSQRS RE TDARRRS AGAE PTARL P W P AALEE WPSCP CE PLGPG 
RRCRWDAMEYDEKLARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFYKEKSKSVKQTCDKCNTriWGLIQTWYT 
CTGC Y YRCHS KCLNL I S KP CVSS KVSHQAE YE LN I CP ETGLDSQ 
DYRCAECRAPI/CS/DGWPSEARQCDYTGQYYCSHCHWNDLAV 
I PAR WHNWDFE PRKVS RCS MR YLALMVSR P VLRLRE I N 


7055 


2 


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP 
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 
S QKV P S RRTRRLLDKSRT FH I TCG AT I CI FS GVHVAAH L VNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWIiFL 
M 


7056 


2 


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMWVttFWKTFLLYNQGP 
E YH YLHQMLG /ALCLS RASAS VLNLNCSLILL PM CRTLLA YLRG 
S QKVPSRRTRRLLDKSRTFHI TCGATI C I FSGVHVAAHL VNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL 
M 


7057 


1368 


431 


GIYLHVNEKIPRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 
S PQER I SEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRECGKTFYRNSQLI FHQRTHTGETYFQCTI CKKAFLR 
S SDF VKHQRTHTGE KP CKCD YCG KGFS DFSG LRHH E K I HTGE KP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
DKHQRSHLGKKPFQ * P VTKLS FP I S I S QPSHKNTQLHQEEL CLR 
GYPC 


7058 


1 


469 


FSGFGAVPDALGCRMSDLRITEAFLYMDYLCFRALCCKGPPPAR 
PE YDLVC IGLTGSGKTS LLS KLCS ES PDNWSTTGFS I KAVP FQ 
NAI LNVKELGGADN I RKYWS R YYQGSQGVI FVLDS AS S EDDLE A 


7059 


1 


1178 


WPAFPRQPAAAAMDALLGTGPRRARGCLGAAGPTSSGRAARTPA 
APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 
C YLS FPDSHSGCLGDTQFSFRMRQCGGQRS PWHADDRHYNSRAP 
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 
FQALLSLIAPEYFDlOUAPCLEAVCSEIDQWPAPAPGQTLNljPVM 
GWVQVRIPSRVDKSESSPPKQFDQENLLPAPWLASVHELDLF 
RCFRPVLTHMQTLWE LMLLGE PLLVLAPS PDVS SEMVLALTS CL 
QPLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNWLGVTNPFFIK 
TLQHWPHILRVGEPKMSGDLPKQVKLKKPFKV*RPWDTKP 


7060 


90 


1670 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLVNPSQ 
YR FEHLVTQM KWRLQEGRGE AVYQ I G VEDNGLL VGLAEEEMRAS 
L KT LHRMAE KVGAD I TVLRE R E VD YDS DM P R K I T E VIi VRKVP DN 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
LHE I QS GRTS S I S F E I LGFNS KGE VHG I NGTQWGQTLRMGW * * * 
RT* DGGRVWRLFE I V*MNALRGL* TSSAPLRKSMGNQLN* IKNG 
VKIKRQGHPGNGLGPGNSEGVGRAGRRH*GPWALGQWNYSDSR 
TAEEICESSSKMITFIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
S ANTG I AGTTREHLGLAIjALKVP F FI WS KI DLCAKTT VERTVR 
QLERVIjKQPGCHKVPMLVTSEDDAVTAAQQFAQSPNVTPIFTLS 

svsgesldllkvflnilppltnskeqeelmqqltefqvdeiytv 
pevgtwggtlsr*idliatlptqpspiysktswpkggdpgi 


7061 


364 


710 


armpsplgppclpvmdpettleepetarlrfrgfcyqevagpre 

ALARIiRELCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQRPGSPEEAAALVEGLQHDP+ARMPSPLGPPCLPVMDPETTL 
EEPETARLRFRGFCYQEVAGPREALARLRELCCQWLQPEAHSKE 
QMLEMLVLEQFLGTLP PE IQAWVRGQRPGSPEEAAALVEGLQHD 
PGQLLG 


7062 


71 


744 


AKAGTNLE RLHWLS YFFC I PKHKLKSSQKDKVRQFMACTQAGER 
TAI YCLTQNEWRLDEATDSFFQNPDSLHRES MRNAVDKKKLERL 
YGR YKDPQDENKIG VDGI QQFCDDLS LDPAS I S VLVI AWKFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQELKDTAKFKD 
FYQFTFTFAKNPGOKGIiDL*MAGAYWKLVLSGRFKFLYLWNTFL 
MEHH 


7063 


2 


562 


LRT VPDLPGRR FRAMRTGQRR * PE LP PDMNS LE QAEDLKAFERR 
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNWLIDPETQKVSF 
FTSLWNHPFFTISCITLIGLFFAGIHKRWAPSIIAARCRTVLA 
EYNMSCDDTGKLILKPRPHVQ*QSSLIVMGLKIAFLRISDTAKS 
HKGFLLRLDM 


7064 


300 


884 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
SRRTGCWTCP PESGHAQARRSRRASASRWGARGAVRSAVAARGC 
SSRAGRWLET PGRRRGP PACAAAAGRLRG PAP * AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRS PGPSPKSAAP 
PLLTPLGAGRAGGSRANS 


7065 


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLERIGKGS FGE VFKG IDNRTQQ VYAIKI I DLEEABDE I EDI 
QQE I T VLSQ CDS S YVTKYYGS YLKGS KLWI IME YLGGGS ALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSNVRAATMMQI CDT 
YNQKHSLFNAMNRFIGAVNNMDQTVMVPSLLRDVPLADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KENITMATEIGSPPRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAG Y YNDLVP P I GMLNN PMNAVTTKFVRTS TNKVKCP VFWRW 
TPEGRRLVTGASSGEFTLWNGLTFKFETILQAHDSPVRAMTWSH 
NDMWMLTADHGGYVKYWQSNMNNVKMFQAHKEAI REARF IHN I P 
FSWPIVMVKLFSKCILGAEMHGLCQFLGNFLHPINTIFFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEYVLLLFLALCSAKPFFSPSHIALKNMMLKDMEDTDDDDD 
DDDDDDDDDDEDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCSDLGLTSVPTNIPFDTRMLDLQNNKIKEIKENDFKGLTSLY 
GLILNNNKLTKIHPKAFLTTKKLRRLYLSHNQLSEIPLNIjPKSL 
AELR I HENKVKKI QKDTFKKK 


7069 


1147 


1765 


frdhrryfyvneqsgesqwefpdgeeeeeesqaqenrdetlakq 
tlkdktgtdsnstessetstgslckesfsgqvsssslmpltpfw 
tll-qsnvpvlqpplplemppppppppespppppppppapkmppp 
EKTKKGRKDKAKKSKTKMPSLVKKWQSIQRELDEEDNSSSSEED 
RV3 TAQKR I EEWKQQQLVS GMAE RNANFEA 


7070 


1 


547 


DGTMEDSEAVQRATAL I EQRLAQE EENEKLRGDARQKLPMDLLV 
LEDE KHHGAQS AALQKVKGQER VR KTS LDL RRE 1 1 DVGG IQNL I 
ELRKKRKQKKRDALAASHE P PPE PEE I TGP VDEETFLKAAVEGK 
MKVIEKFLADGGSADTCDQFRRTALHRASLEGHME1LEKLLDNG 
ATVDPQ 


7071 


2 


921 


ARGTLRALE TAXKVGKVGANGQ KAAG P S ADS VTENKI GS P PKTP 
VSNVAATSAGPSNVGTELNS VPQKSS P FLTRVPAYPPHSENI QY 
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRI WRPPMYQRDD 1 1 RSNSLP PMDVMHS S VYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQ YHTQ KAPLVS STLP VATQSPT PPS TLNRGEGS 


7072 


2 


921 


ARGTLRAIiETAKKVGKVGANGQKAAGPSADS VTENKI GSP PKTP 
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY 
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 
SLRER YNS LDGYYSVACQ P P S E PRTT VP LPRE PCGHLKTS CEEQ 
IRRKPDQWAQ YHTQ KAPLVS STLPVATQS PTP PS TLNRGEGS 


7073 


50 


504 


LAHGS FG VSD F P APAAAP AHTLTS FSGS LSPQ FR K PLGRAPAMP 
LVR YRKWI LG YRCVGKTS LAHQFVEGE FSEGYDPTVENTYS KI 
VTLGKDE FHLHLVDTAGQDE YS I LP YS F I IGVHGYVLVYSVTSL 
HSFQVI ESLYQKLHEGHGK 


7074 


263 


1003 


VCPVLCSTRQEPGHS S LVT YFGKPTRRKEFIiLGHCIAAGKMNIS 
VDLETNYAELVLDVGRVTLGENSRKKMKDCKLRKKQNERVSRAM 
CALLNSGGGVI KAE I ENED YS YT KDG I GLDLENS FSNILLFVPE 
YLDFMQNGNYFLIFVKSWSLNTSGLRITTLSSNLYKRDITSAKV 
MNATAAIiEFLKDMKKTRGRLYLRPELLAKRPRVDIQEENNMKAL 
AGVFFDRTELDRKEKLTFTES THVE I 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRLSRLQGVERIMKKTEESESQ 
VEPEI KRKVQQKRHCST YQPTPPLSPAS KKCLTHLEDLQRNCRQ 
AITLNESTGPLLRTSIHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076 


279 


1049 


LQSESSNAAEGNEQRHEDEQRS KRGG WS KGRKRKKPLRDSNAPK 
SPLTGYVRFMNERREQLRAKRPEVPFPEITRMLGNEWSKLPPEE 
KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
RQDAARQATHDHEKETEVKERS VFDI P I FTESFLNHS KAREAEL 
RQLRKSNMEFEERNAALQKHVESMRTAVEKLEVDVIQERSRNTV 
LQQHLETLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


SSMGSNSE INGLALRKTDKYGFLGGSQYSGSLKSS I PVDVARQR 
ELKWLDMFSNWDKWLSRRFQKVKLRCRKG I PS SLRAKAWQYLSN 

RGGHGQQDL YRI L KAYT I YR P D EGYCQAQAP VAAVLLMHMPAEQ 
AFWCLVQICDKYLPGYYSAGLEAIQLDGEIFFALLRRASPLAHR 
HLRRQRIDPVLYMTEWFMCIFARTLPWASVLRVWDMFFCEGVKI 
IFRVALVLLRHTLGSVEKLRSCQGMYETMEQLRNLPQQCMQEDF 
LVHEVTNLPVTEALIERENAAQLKKWRETRGELQYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQFILLAKGTSGSALTALISQVLEAPG 
VYVFGELLELANVQBLAEGANAAYLQLLNLFAYGTYPDY IANKE 
SLPELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
MQARKKRRGI IEKRRRDRINSSLSELRRLVPTAFEKQGSSKLEK 
AE VLQMTVDHLKMLHATGGTGTHALLFQAS FI QQI F 


7080 


200 


595 


VQLPLEAPCLSLLSCRDHSGGNRDIiSRKHRDCRVYGSPQDGIPY 
LTHPLCHQDWSVGRLQIRALATPGHTQGHLVYLLDGEPYKGPS 
CLFSGDLLFLSGCGEFPRKKEELGEEGBTEVRAATVPWRALKP 


7081 


213 


506 


AVTEEEMILNSLSLCYHNKLIIiAPMVRVGTLPMRliLALDYGADI 

VYCEELIDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 

RWFQMGTS 


7082 


3 


1137 


APSRNTMLMAWCRGPVLLC^RQGLGTNSFLHGLGQkPFEGAkSL 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKK 
L FRLNNFGLLNSNWGAVP FGK I VGKFPGQ I LRS S FG KQ YMLRRP 
ALED YWLMKRGTAI TFPKDINM I LSMMD INPGDTVLEAGSGSG 
GMS L FLS KAVGSQGRV I S FE VRKDHHDLAKKN Y KHWRDS WKLSH 
VEE WPDNVDF IHKD ISGATEDI KSLTFDAVALDMLNPHVTLPVF 
YPHLKHGGVCPVYWNITQVIELLD 


7083 


115 


541 


RSNAVQLTRME YAMKSLS LLYPKS LSRHVS VRTS WTQQLLSEP 
S PKAPRARPCRVSTADRS VRKG I MAYSLEDLLLKVRDTLMLADK 
PFFLVLEEDGTTVETEEYFQALAGDTVFMVLQKGQKWQPPSEQG 
TRHPLSLSHK 1 


7084 


3 


522 


NSVSVSSQSRFLASVPGTGVQRSAAADMAASTAAGKQRIPKVAK 
VKNKAPAEVQ ITAEQLLREAKERE LELLP PP PQQKITDEEELND 
YKLRKRKTFEDNI RKNRTVI SNW I KYAQWEES LKE IQRARS I YE 
RALDVDYRNITLWLKYAEMEMKNRQVNHARNIWDRAITTL 


7085 


243 


1499 


RQLARLRRRGWRS PFGGAP MAH I T I NQ YLQQVYE AI DS RBGAS C 
AELVSFKHPHVANPRLQMASPEEKCQQVLEPPYDEMFAAHLRCT 
YAVGNHDF I EAYKCQTVI VQSFLRAFQAHKEENWALPVMYAVAL 
DLRVFANNADQQLVKKGKS KVGDMLEKAAELLMS CFRVCASDTR 
AGIEDSKKWGMLFLVNQLFKIYFKINKLHLCKPLIRAIDSSNLK 
DDYS TAQR VTYKYYVGRKAMFDSDFKQAEE YLS FAFEHCHRSSQ 
KNKRMILIYLLPVKMLLGHMPTVELLKKYHLMQFAEVTRAVSEG 
NLLLLHE ALAKHEAFF I RCG I FL I LEKLKI IT YRNLFKKVYLLL 
KTHQLS LDAFLVALKFMQVEDVD I DEVQCILANL I YMGHVKGYI 
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNSKLRPEVMQDIiLESTDFTEHEIQEWYKGFLRDCP 
SGHLSMEE FKKI YGNFFP YGDAS KFAEHVFRT FDANGDGT I D FR 
EF 


7087 


166 


723 


LSGS SAGKVAAPCVPPSNHELVP I TTENAPKNWDKGEGASRGG 
NTRKSLEDNGSTRVTPSVQPHLQPIRNMSVSRTMEDSCELDI)VY 
VTER 1 1 AVS FPS TANEENFRSNLRE VAQMLKSKHGGNYLL FNLS 
ERRPDITKLHAKVLEFGWPDLHTPALEKICSICKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAAS PSSLLEMAGEITETGELYSSYVGLVYMFNLIVGTGALT 
MPKAFATAGWLVS LVLLVFLGFMS FMTTTFVI EAMAAANAQLHW 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS 
PNP FE I TDRVEMGQMASMFFNKVGVNLF YFC 1 1 VYLYGDLAI YA 
AAVP FS LMQVTCSATGNDS CGVEADTKYNDTDRC WG PLRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLP PGTMPSASDWIG I FKVEAACVRDYHTFVWSS VPESTTDG 
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 
PRPMDE LVTLEE ADGGSD I LL WPKATVLQNQLDE S QQE RNDLM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD 
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL 
NLDLKEAKSWQEEQSAQAQRLKDECVAQMKDTLGQAQQRVAELEP 
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR 
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDEQLK 
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREfi 
TBLRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7090 


33 


1775 


svcwedrylkarmeesplsrapsrggvnflnvartyipntkvec 
hytlppgtmpsasdwigifkveaacvrdyhtfvwssvpesttdg 
spihtsvqfqasylpkpgaqlyqfryvnrqgqvcgqsppfqfre 
prpmdelvtleeadggsdillwpkatvlqnqldesqqerndlm 
qlklqlegqvtelrsrvqeleralatarqehtelmeqykgisrs 
hgeiteerdilsrqqgdhvarileleddiqtisekvltkeveld 
rlrdtvkaltreqekllgqlkevqadkeqseaelqvaqqenhhl 
nldlkeakswqeeqsaqaqrlkdkvaqmkdtlgqaqqrvaelep 
lkeqlrgaqelaassqqkatllgeeiasaaaardrtiaelhrsr 
levaevngklaelglhlkeekcqwskeragllqsveaekdkilk 
lsaeilrlekavqeertqnqvfktelarekdsslvqlseskrel 
telrs alrvlqke keqlqee kqelleymrklearlekvade kwn 
edattedeeaavglscpaaltdsedespedmrlhpmafvsvetq 
aslllgle 


7091 


186 


1076 


EGMLTREHRCGRSEEQELEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAY 
ALHKQMRIVKPKVASMEEMATFHTDAYLQHLQKVSQEGDDDHPD 
SIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAIN 
WSGG WHHAKKDEASG FC YLNDAVLG I LRLRRKFERI L YVDLDLH 
HGDGVEDAFS FTS KVMTVS LHKFS PGFFPGTGDVSDVGLGKGR Y 
YSVNVPIQDGIQDEKYYQI CERYEPPAPNPGL 


7092 


522 


809 


KQGINEDQEBSQKPRLGEGCEPISKRQMKKLIKQKQWEEQRELR 
KQKRKEKRKRKKLERQCQMEPNSDGHDRKRVRRDWHSTLRLII 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQASMVRMSFVIAACQLVLGLLMTSLTESSIQNS 
ECPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508 


FVRSMHWGVGFASSRPCWDLSWNQSISFFGWWAGSEEPFSFYG 
DI IAFPLQDYGG IMAGLGSDPWWKKTLYLTGGALLAAAAYLLHE 
LLVIRKQQE I DS KDAII LHQFARPNNGVPSLS PFCLKMETYLRM 
ADLPYQNYFGGKLSAQGKMPWIEYNHEKVSGTEFI I 


7095 


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR 
SLECVSHEVDSHYCPSCLENMPSAEAKLKKNRCANCFDCPGCMH 
TLSTRATS I S TQL PDDPAKTTMKKAYYLACG F CRWTSRD VGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPSQAGRRRSSRISFAGALFLTRFLLQELLLNNFC 
S AMSPAPDAAPAPAS ISLFDLSADAPVFQGLSLVSHAPGEALAR 
APRTSCSGSGBRESPERKLLQGPMDISEKLFCSTCDQTFQNHQE 
QREHYKLDWHRFNLKQRLKDKPLLSALDFEKQSSTGDLSS ISGS 
EDSDSASEEDLQTLDRERATFEKLSRPPGFYPHRVLPQNAQGQF 
LYAYRCVLGPHQDPPEEAELLLC2NLQSKGPRDCWLMAAAGHFA 
GAI FQGREWTHKTFHR YTVRAKRGTAQGLRDARGG PSHSAGAN 
LRRYNEATLYKDVRDLLAGP S WAKALE E AGT I LLRAPRS GR SL F 
FGGKGAPLQRGDPRLWDI PLATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRLHS PQTHWKTVREERKKPTEE E I RKI CRDEKEALGQ 
NEES P KQGSGS EGEDGFQVE LELVELTVGTLDLCESE VLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSQAVAAPLGPL 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPS PADPRVLS L 
L5APLGSGGFTLLHAAAAAGRGSWRLLLEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGLETNNSNS ELPLRVGLKVAQGS PLMGGQVSA 
SNS FSRLHCRNANEDWMSAIiCPRLWDVPLHHLS IPGSHDTMTYC 
LNKKS PISHEESRLLQLLNKALPCI TRPWLKWS VTQALDVTEQ 
LDAGVRYLDLRIAHMLEGSEKNLHFVHMVYTTALVEDTLTEISE 
WLERHPREWILACRNFEGLSEDLHEYLVACIKNIFGDMLCPRG 
E VPTLRQLWSRGQQ V I VS YEDE S S LRRHHBLWPG VP YW WGNRVK 
TEALIRYLETMKSCGR 


7098 


82 


956 


S S F LKRCRKY LGCWG IPS EQSLFS TLEEPRDKE I DNYCVMRLQT 
EARSGFWAPNRFPVNICRMTAVDGDRGGSSRETCRCHFHPSLEA 
LVLLLQDWQPGGVG I cts FLG i s walldyhralrtclps KPLLG 
LGS S VI YFLWNLLLLWPRVLAVALFS ALFPS YVALHFLGLWLVL 
LLWVWLQGTDFMPDPSSEWLYRVTVATILYFSWFNVAEGRTRGR 
AI IHFAFLLSDS ILLVATWVTHSSWLPSGIPLQLWLPVGCGCFF 
LGLALRLVYYHWLHPSCCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRS XiARQG YHQI WAFP FLPSGATAT WPAASRS RSLA 
ARSLPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD 
GAVLE VHVPQ I GAGVS LPG I LAAKCGAEVI LSDSSEL PHCLEVC 
RQS CQMNNL PHLQ WGLTWGHI S WDLLALP PQD 1 1 LASD VFFEP 
EDFEDILATIYFLMHKNPKVQLWSTYQVRSADWSLEALLYKWDM 
KCVHI P LES FDAD KED I AES TLPGRHT VEML VI S FAKDS L 


7100 


205 


671 


ANGGFWEAAPGSEVSLPLWPTASHSKTTALGIGSAPPPHLSVL 
FLFS F P PQLGD P LEAF P VFKKYDRNG LNVS I E CKR VSGL E PATV 
DWAFDLTKTNMQTMYEQSEWGWKDREKREEMTDDRAWYLIAWEN 
SSVPVAFSHFRFDVERGDEVLYW 


7101 


2 


503 


WRGG PRRAKRLAGGAVGW VLLVRG VH S VRAGGGRP PRAADMKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR 


7102 


2 


503 


WRGG PRRAKR LAGG AVGW VLLVRG VHS VRAGGGRP PRAADMKKD 
VR I LLVGEPRVGKTSLI MSLVSEBFPEEVPPRAEE I TI PADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWIPLINERTDKDSRLPLILGGNRSDLVEYSR 


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSSSES 
LSD KG S E LKKS FDAWFDVLKVTPE E YAGQ I TLMD VP VFKAI QP 
DELSSCGWNKKEKYSSAP 


7104 


1670 


795 


RLWEHRSVS AGASGWGLSS PGCLLLHPSLPEEERVD I LINNAGV 
MRCPHWTTEDGFEMQFGVNHLGEAWAGAAPWVQAILPRRPPKVL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGH I D FDDLNWQTRKYNT KAAYCQS \KLAIVLFTKELSRRLQ 
GSGVTVNALHPGVARTELGRHTGIHGSTFLQHHN \ WAHLLAAWS 
KS PRS WPAPAQHNTLAVAEELA\ VI SGKYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREQPLPR 


7105 


765 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGLQDPQCLALFRVAVDKHQA 
LLKAAMSGQGVDRHLFALYI VSRFLHLQS PFLTQVHSEQWQLS r 
S Q I P VQQMHLFDVHN YPD YVS S GGGFG PADDHG YG VS Y I FMGDG 
M I TFH I S S KKS STKTDSHRLGQHIEDALLDVASLFQAGQHFKRR 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7106 


14 


1064 


GLQAGH PHPRSASR I PEADTH \ YSKLQRAFDS IVNKDHKRMFGT 
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF 
PYIKTRISVIQKEEFVLTPIEVAIEDMKKKTLQLAVAINQEPPD 
AKMLQM VLQGSVGATVNQGPLEVAQVFLAE I PAD PKLYRHHNKL 
RLCFKEFIMRCGEAVEKNKRLITADQREYQQELKKNYNKLKENL 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS 


7107 


1145 


591 


*I*WLQTGKKK 


7108 


1 


942 


VKVALLLTNLEQ PRTESEWENS FTLKMFLFQF VNI/NSS TF Y IAF 
FLGRFTGH PGAYLRL I NRWRLEE CHP SGCL I DL CMQMG 1 1 MVLK 
QT WNNFME LG YPLI QNW WTRRKVRQE HGPER KI S F PQWE KD YNL 
QPMNAYGLFDEYLEM I LQFG FTTI FVAAFPLAPLLALLNNI IE I 
RLDAYKFVTQWRRPLASRAKD IGIWYGILEGIGILSVI TNAFVI 
AITSDFIPRLVYAYKYGPCAGQGBAGQKCMVGYVNASLSVFRIS 
DFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSIjVPYGYTIjQF 
WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEALSIQLQPKE 
TQPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 
SEKLATDTSTFEATSEGTLELQQRNPKAERLRWSPAQEESFRQM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGEKPYKC 
SDCGKTFKQS SNLGQHQR IHTGEKP FE CNECGKAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKSCG 
KAYGWCSELIRHRRVHARKEPSH 


7110 


96 


697 


RLDNFSGFLVEVTKEERHIVKPLYDRYRLVKQMLTRASITPVLG 
S P STKRRGQMLQ P 1 1 EGE TAHFFEE I KEEEE DG VNL S S ELGDML 
KTAVQVQSSLKNSESDVEENQEKLALDLRLSSSRAASMPELLEQ 
LWKARAEKKKLRKTLREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
YKKIKAKLRLLEVLISKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFAIjELNELTAE 
LKRSLPSTDTRLRPDQRYLEEGNIQAAEAQKRRIEQLQRDRRKV 
MEENN I VHQ ARF FRRQTD S S GKEWWVTNNTY WRLRAE PG YGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRLIGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW 
FKND QD I QLS EHFS VKVEQAKY VSMT I KGVTS EDSGKYS INI KN 
KYGGEKIDVTVSVYKHGEKIPDMAPPQQAKPKLIPASASAAGQ 


7113 


1 


824 


KCLRQAWHEAPSSIiAFTRWCSREERAEGGGNLHRSITRDPKPPG 
LRPSQRPMDDKKKKRSPKPCLAQPAQAPGTLRRVPVPTSHSGSL 
ALGLPHLPS P KQRAKFKRVGKJEKCRP VLAGGGSGSAGTPLQHS F 
LTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKLARLHFSLDVCGEEEDDEEEEDGVTEGLPEEQKKIWADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPLNLPRR 


7114 


3 


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETIiKDESGQECKICRKI 
IYLNTDFVSVKQRLPKYYSWERCSKHHLNFU3QNRSYVRKKDDG 
CKAYWKVCLHYNLHKAQPAERFFDPNQRGKALHQKQAIiRKSQRS 
QTCEKLYKCTECGKVFIQKANLWHQRTHTGEKPYECCECAICAF 
SQKSTLIAHQRTHTGEKPYECSECX3KTFIQKSTLIKHQRTHTGE 
KPFVCDKCPKAFKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RI HTS EKP QCS EHGKASDEK PS PTKHWRTHTKEN I YE CS KCG KS 
FRGKSHLSVHQRIHTGEKPYECSICGKTFSGKSHLSVHHRTHTG 
EKP YECRRCGKAFGEKSTIi 1 VHQRMHTGE KP YKCNE CG KAFS E K 
SPLIKHQRIHTGERPYECTDCKKAFSRKSTLIKHQRIHTGEKPY 
KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 
KHQRSHTGDKNL 


7115 


1 


947 


NAAHG YNYJGLW CM Y 1 1 PPQDWLDRGDESAP I RT PAM I G CS F WD 
RE YFGD I G LLD PGME VYGGENVKLGMRVWQCGGS ME VLPCS RVA 
HIERTRKP YNNDID YYAKRNALRAAEW7MDDFKSHVYMAWNI PM 
SNPGVDFGDVSERLALRQRLKCRS FKWYLENVYPEMRVYNNTLT 
YGEVRNSKASAYCLDQGAEDGDRAILYPCHGMSSQLVRYSADGL 
LQLGP LG S TAFLPDS KCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSGPIVSRATGRCLEVEMSKDANFGLRLWQRCSGQKWMIRN 
WIKHARH 


7116 


866 


95 


RVRMRRNAE VIEEKLSMKSWAKFRPGEPWKG YPN I DPETDP YVT 
PGSVINNLS I NTVREVDHLRDRNS GS S S SLNTTLPS TS AWS S IR 
ASNYNVPLS S TAQS TS ARNSDS KLTWS PGS VTNTS LAHE L WKVP 
L P P KN I TAPS RPPPGLTGQKP P LSTWDNS PLR IGGGWGNSDAR Y 
TPG S SWG E S S S GR I TNWL VLKNLTP Q IDGSTLRTLCMQHG PL I T 
FHLNLPHGNALVRYSSKEE WKAQKSLHI SDLFLLTL 


7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGALPQGA 
FVSQAARAI PLLQPSQAAQAEGLSQPARACGALCSLPWPLRNWG 
SPILRLPGGLRTPTNDRKTRTRSAMACWARAQWDTLGPLKIiSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCEPNPGAGAMVLLKVLFEHAVGYALLALKEVEEISLLQPQVE 
E S VLNLG K FH S I VR LVAF CP FAS S Q VALEN AN AVS EG WHE DLR 
LLLE THLP S KKKKVLLGVGDPK I GAAI QE ELG YNCQTGG VI AE I 
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD 
NMI IQS ISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRLAQF I GNRRELNEDKLE KLEE LTMDGAKAKA I LDAS RS SMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
AL I GE AVGARL I AHAGS LTNLAKY PAS TVQ I LG AEKALFRALKT 
RGN TP KYGL I FHSTFI GRAAAKNKGR I SR YLANKCS XAS RI DCF 
SE V PTS VFGEKLREQVEERL SF YE TGE I PR KNLDVMKE AMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTSIPKRKKSTPKEETVNDPBEAGHRSGSKKKRK 
F S KE E P VS SG PEE AAG KS S S KKKKKFHKASQED 


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ES VLNLGKFHS I VRLVAFCPFASSQVALENANAVS EGWHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYWCQTGGVIAEI 
LRG VRLHFHNLVKGLTDLS ACKAQLGLGHS YS RAKVKFNVNRVD 
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT 
YCRLAQ FI GNRRE LNEDKLEKLEELTMDGAKAKA I LDAS RS S MG 
MD I S AI DL IN I E S FS SR WS LS E YRQ S LHTYLRS KMSQVAPSLS 
AL I GEAVGARL I AHAGS LTNLAKYPAS TVQI LGAE KAL FRALKT 
RGNTPKYGLI FHSTF IGRAAAKNKGR I SRYLANKCS I ASR I DCF 
SEVPTS VFGEKLREQVEERLS FYETGE I PRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGStSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSS KKKKKFHKASQED 


7120 


1991 


64 


QliGTRRCI^RGDKVTNAMQDFLVTNLEPRFIEPQTANLSVVFKDS 
NSTTPL I FVLS PGTDPAADL YKFAEEMKFS KKL SA I S LGQGQGP 
RAEAMMRSS IERGKWVFFQNCHLAPSWMPALERLI EHINPDKVH 
RDFRL WLTS L PSNKFP VS ILQNGS KMTI E P PRG VRANLLKS YS S 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTDGDLRICISQLKMFLDEYDDIPYKVLKYTAGEINYGGRVTD 
DWDRRCIMNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLHGY 
LSYIKSLPLNDMPEIFGLHDNANITFAQNETFALLGTIIQLQPK 
S SSAGS QGREE I VED VTQNI LLKVPE P INLQWVMAKYP VL YEES 
MNTVLVQEVIRYNRLLQVITQTLQDLLKALKGLWMSSQLELMA 
ASLYNNTVPELWSAKAYPS LKPLSS WVMDLLQRLDFLQAW IQDG 
I PAVFWISGFFFPQAFLTGTLQNFARKFVISIDTIS FDFKVMFE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHSTNYVI 
AVEIPTHQPQRHWIKRGVALICALDY 


7121 


2 


546 


R PLR PVTVLSLGS MVGLMT YGRRQFQS LDTTMRRL I P P FREASAK 
LTTLVDADAEAFTAYLEAMRLP KNT PEE KDRRTAALQEGLRRAV 
S VP LTLAETVAS LWPAI*QEIARCGNLACRSDI<QVAAKALEMGVF 
GAYFNVL INLRD I TDEAFKDQ IHHRVS S LLQEAKTQAALVLDCL 
ETRQE 


7122 


2 


546 


rplrpwlslgsmvg£mtygrrqfqsldttmrrlippfreasak 
lttlvdadae aftayleamrlp kntpe ekdrrtaalqeg lrrav 
svpltlaetvaslwpalqelarcgnlacrsdlqvaakalemgvf 
gayfnvl inlrd i tdeafkdqihhrvs sllqeaktqaalvldcl 

ETRQE 


7123 


1 


1092 


KPAVPEARSAGTSEAGRSGAEEVSCGSVSGDGAAMRLTPRALCS 
AAQAAWRENFPLCGRDVARWFPGHMAKGLKKMQSSLKLVDCIIE 
VHDARIPLSGRNPLFQETLGLKPHLLVLNKMDLADLTEQQKIMQ 
HLEGEGLKNV I FTNCVKDENVKQ I I PMVTELIGRSHRYHRKENL 
E YCIMVIGVPNVGKSSL INS LRRQHLRKGKATRVGGE PG I TRAV 
MSKIQVSERPLMFLLDTPGVLAPRIESVETGLKLALCGTVLDHL 
VGEETMAD YLL YTLNKHQR FGYVQH YG LGSACDNVERVL KS VAV 
KLGKTQKVKVLTGTGNVNVIQPNYPAAARDFLQTFRRGLLGSVM 
LDLDVLRGHPRV 


7124 


2 


382 


LPLTLLLAAPFAHLLLPPGHDQSPCWHPGPALSPGTLGPLSWAM 
ANSGLQLLGYFLALGGWVGI I ASTALPQWKQSS YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCX5SSESRGVNESHKSE 
FIELRKWLKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMIIS 
L PE S CLLT \RDT VI RS YLG AY I TKW K P P PS P LLALCTFL VS E KH 
AGHRSLLEA\YLEILPKAYTCPVCLEPEWNLLPKSLKAKAEEQ 
RAHVQEFFAS SRDFFSS LQPLFAEAVDS I FSYSALLWAW CTVNT 
RAVYL\S PGSGNAFLQS RTPVQLAP YLDLLNHS PHVQVKAAFNE 
ETHS YE IRTTSRWRKHE EVF I CYGPHDNQRLFLE YGFVS VHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFI VPS PARRCS QKGS LGHLPTQPWLWAAMSPRGQERGT 
SHSQARE PQRPGRWLLGSLQS S PGTLGQAGTAS RRRGCMVQRWV 
Q VATGRRAVQVP KGALGLALGETS PGAS RGMSGGAGGCWALG WA 
PSPVLPSWLLEGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
WRGGR I AS AEAS ST* TPGS GS RARS GRRS PGS RRRS AS AP S PTP 
PTDACA+ SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR*IEKRPFKEI*RRIPRIF 
AKQKQI * S * NSQKIGASE I DRGRKEADCSDAPAAAR IGAVSVFR 
RS TQE AR VS PRSNAKS ANLRAVRAD * WEHF VLL FHTPEQ FLAE C 
ICRST**K*WHQLC*PLSSL*TGLKRKLLL*VLFRI*WLKDCDV 
* FOQKI FATNFCNWQNLIQ* EE * KPVEYS VEN* HIMNLLLPM* h 
CQSSLRDQTIVTWRM*RNYSMFRINMISSL*DGSIHIPLKLHFY 
PALIFTLTVPINSCCQRPLPLFAHQSIKTIiASSGSPMLACLRFL 
LVKKRAFIHTPRSPGCSV*CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALRELSQIEAELNKHWRRLLEGLSYYKPPSP 
S S AE KVKANKD VAS PLKELGLRI SKFLGLDEEQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LTYFQDERHPYRVEYADCVDKLEKELVSKYRQQFEELYKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEI I FLYYAYFEMAPSD 
LLVLTKMFKEQGFGSRQTNRHLVDETMDPFVDRIGYFSALILVE 
GMDI E SLH KCALDDRRE LHQFAQDGL I CQDMDCLMLTFGD I PHH 
APVLLAWALLRHTLNPEETSSVVRKIGGTAIQLNVFQYLTRLLQ 
SLASGGNDCTTSTACMCVYGLLS FVLTSLELHTLGNQQDI IDTA 
CEVLADPSLPELFV7GTEPTSGLGIILDSVCGMFPHLLSPLLQLL 
RALVSGKS TAKKVYS FLDKMS FYNEL YKHKPHDVISHEDGTLWR 
RQTPKLLYPLGGQTNLR I PQGTVGQVMLDDRAYLVRWE YS YS S W 
TLFTCE I EMLLHWSTADVIQHCQRVKP I IDLVHKVI STDLS I A 
DCLLP ITSR I YMIjLQRLTT VI S P P VD VI AS CVNCLT VLAARNPA 
KVWTDLRHTG PL P FVAH P VSS LS QM I SAEGMNAGG YGNLLMNS E 
QPQGEYGVTIAFLRLITTLVKGQLGSTQSQGLVPCVMFVLKEML 
P S YHKWRYNSHG VREQ I G CL I LEL IHAI LNLCHETD LHS SHT P S 
LQFLCICSLAYTEAGQTVINIMGIGVDTIDMVMAAQPRSDGAEG 
QGQGQLLIKTVKLAFSVTNNVIRLKPPSNWSPLEQALSQHGAH 
GNNLIAVLAKYIYHKHDPALPRLA1QLLKRLATVAPMSVYACLG 
NDAAAIRDAFLTRLQSK\IE\DMRIK\VM1L\EFLTVA\VETQP 
GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWELIDSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
S P L FGTLS PPSETSEPSI LETCAL IMKIICLEI Y YWKG SLDQP 
LKDTLKKFSIEKRFAYWSGYVKSLAVHVAETEGSSCTSLLEYQM 
LVSAWRMLLIIATTHADIMHLTDSWRRQLFLDVLDGTKALLLV 
PAS VNCLRLGSMKCTLLL I LLRQ WKRELGSVDE I LGPLTE I LEG 
VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDIPQYSQLVLNV 
CETLQE E V I AL FDQTRHS LALG SATE D KDSME TDDCS RS RHRDQ 
RDGVCVLGLHLAKELCEVDEDGDSWLQVTRRLPILPTLLTTLEV 
SLRMKQNLHFTEATLHLLLTIARTQQGATAVAGAGI TQS I CLPL 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 
LRYNFLPEALD FVGVHQERTLQCLNAVRTVQS LACLEEADHTVG 
FILQLSNFMKEWHFHLPQLMRD IQVNLG YLCQACTS FLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASEQQALHTVQYGLLKILSKTLAALRHFTPDVCQILLDQSLDLA 
EYNFLFALSFTTPTFDSEVAPSFGTLLATVNVALNMLGELDKKK 
E PLTQAVGLSTQAEGTRTLKSLLMFTMENCFYLL ISQAMRYLRD 
PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP 
SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR 


7129    1    1054
FRRFRWRRRLH*AGPASSAGGSPGEASGTMSGELPPNINIKEPR
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQGIV
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI
TGCMMTFYRTTPAVLFWQWINQSFNAWNYTNRSGDAPLTVNEL
GTAYVSATTGAVATALGLNALTKHVSPLIGRFVPFAAVAAANCI
NIPLMRQRELKVGIPVTDENGNRLGESANAAKQAITQVWSRIL
MAAPGMAIPPFIMNTLEKKAFLKRFPWMSAPIQVGLVGFCLVFA
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGL


7130    2    780
HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSLGRKG
ISAKSQPYHRSQSSSSVLINKSMDSINYPSDVGKQQLLSLHRSS
RCESHQDLLPDIADSHQQGTEKLSDLTLQDSQKVWVNRNLPLN
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLSPYLTP
YNDSDKLNDYLWRGPSPNQQNIVQSLREKFQCLSSSSFA


7131    805    573
AAAEGHIEWKFLIEACKVNPFAKDRWGNIPLDDAVQFNHLEW
KLLQDYQDSYTLSETQAEAAAEALSKENLESMV


7132    1420    1087
IDMLLLSGALVSGPYTLITTAVSADLGTHKSLKGNAHALSTVTA
IIDGTGSVGAALGPLLAGLLSPSGWSNVFYMLMFADACALLFLI
RLIHKELSCPGSATGDQVPFKEQ


7133    2    3648
QQIPGLLPAHGESGDALRKPRLQKPITGHLDDLFFTLYPSLEKF
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGLGFVQ
RPQVWLVPEMDVALTRSASFSRKWSSSKTSSGSQALVLRSRL
RLPEMVGHPAFAVIFQLEYVFSSPAGVDGNAASVTSLSNLACMH
MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL
ARPTSQLPHGSQASPAQAQEFPLEAGISHLEADLSQTSLVLETS
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP
EILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFLAFS
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFLKPGERRCFA
RYLAVQTLQIDVWDGDSLLLIGSAAVQMKHLLRQGRPAVQASHE
LEVVATEYEQDNMVVSGDMIjGFGRVKPIGVHSWKGRLHLTLAN
VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR
KHVVC^QKLADVDSETJAAMLLTHARQGKGPQDVSRESDATRRRK
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAYR
ERTKAESIASLLSLAITTEHTLHATLGVAEFFEFVLKNPHNTQH
TVTVEIDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHLRGSL
APQLYLRPHETAHVPFKFQSFSAGQLAMVQASPGLSNEKGMDAV
SPWKSSAVPTKHAKVLFRASGGKPIAVLCLTVELQPHWDQVFR
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV
ICETQNVGPGEPRDIFLWASGPSPEIKDFFVIJYSDRWLATPT
QTWQVYLHSLQRVDVSCVAGQLTRLSLVLRGTQTVRKVRAFTSH
PQELKTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNLVDVD
CHQLVASWLVCLCCRQPLISKAFEIMLAAGEGKGVNKRITYTNP
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV
GEEEILIYINDHEDKNBEAFCVKVIYQ


7134    2115    1111
GGEGFSYPPHVGLSLGTPLDPHYVLLEVHYDNPTYEEGLIDNSG
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTL
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL
LAYDDDFDFNFQEFQYLKEEQTILPGDNLITECRYNTKDRAEMT
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI
YRPVTTWPFIIKSPKQYKNLSFMDAMNKFKWTKKEGLSFNKLVL
SLPVNVRCSKTDNAEWSIQGMTALPPDIERPYKAEPLVCGTSSS
SSLHRDFSINLLVCLLLLSCTLSTKSL


7135    2    2072
FVPRVTPRSLSLQGPKGESVGSITQPLPSSYLIFRAASESDGRC
WLDALELALRCSSLLRLGTCKPGRDGEPGTSPDASPSSLCGLPA
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR
KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE
ENKSLMWTLLKQLRPGMDLSRWLPTFVLEPRSFLNKLSDYYYH
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS
GSITAKSRFYGNSLSALLDGKATLTFLNRAEDYTLTMPYAHCKG
ILYGTMTLELGGKVTIECAKNNFQAQLEFKLKPFFGGSTSINQI
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR
QRLRQHTVPLEEQTELESERLWQHVTRAISKGDQHRATQEKFAL
EEAQRQRARERQESLMPWKPQLFHLDPITQEWHYRYEDHSPWDP
LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC
PRCRKEARRLQALHEAILSIREAQQELHRHLSAMLSSTARAAQA
PTPGLLQSPRSWFLLCVFLACQLFITJHILK


7136    2    418
DFVPSFRRPSGNTSQTVWLLRAATLEKEVAGLREKIHHLDDMLK
SQQRKVRQMIEQLQNSKAVIQSKDATIQELKEKIAYLEAENLEM
HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV
IRWET


7137    2    466
WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI
PDD^GNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFV^YFT
RMEQLSDKESYKLSCQLEPENP


7138    2    466
WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT
RMEQLSDKESYKLSCQLEPENP


7139    1    357
SLRNSARGLKMAASAARGAAALRRSINQPVAFVRRIPWTAASSQ
LKEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGLRNALQ
QENHIIDGVKVQVHTRRPKLPQTSDDEKKDF


7140    1401    1357
RASSLQVLKAWGGLIPSSFQQQHTGQYALEELFDLKVYDCFCSF
NMNVSLEKQLRPSQPWPRGKCRKTPGWBEARPKAQDLRGDLGKT
QAGPAEAHTRGPPRLPAATGCPPHLPGLIiSGISVDIDPTGLQSQ 
WTPKGQDPPLMFSEDYQKSLLEQYHLGLDQKLRKYWGELIWNF
ADFMTNQCG 


7141    124    1073
LDSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVL
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD
EANRLAAQLEQCALQDRESAGEGLGPRRVKPSPRRETFVLKDSP
VRDLLPTVNSLTRSTPS/LKQPDASTPE*+*EGVSQGSPGYIWK
EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA
AASEETRAAKLRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP
GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM
GATRSNLQPP




7142    [illegible]    839
LIFLMLHMELKMLSSVTLHIRAFLYWICLKPTSCLIFQNVLNLL
KK*SRAVG\ATWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNILSYFP
ECFEFIEEAKRKDGWLVHCNA


7143    3    773
SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ
GSTTSSSKNIAYNCCWDQCQACFNSSPDLADHIRSIHVDGQRGG
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA
SFASQGGIARHVPTHFSQQNSSKVSSQPKAKEESPSKAGMNKRR

TfT.TTMVRPPQT.ZiDD'HTlWPnZinTT.nil TDTTP 3i TfCMT. C7MJ TDOT /*»trr« 
r^LJi\iH r\^i\r^ j lAnJ^c niJ V P Ur\\£ X. iJl/MX JxTlK/lXLtr iNlj^/Vrl-L EjOijVji\.w 

HSWFHSTVSILLFFQIKYKTLQKNISTIISKSLKI 


7144    1    988
FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR
RCPAPRPAGVSYVIRDEVEKYWRNGVNALQLDPALNRLFTAGRD
S11RIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAKDKELVASAGLDR
QIFLWDVNTLTALTASNNTVTTSSLSGNKDSIYSLAMNQLGT11
VSGSTEKVLRVWDPRTCAKLMKLKGHTDNVKALLLNRDGTQCLS
GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG
RDRKIYCTDLRNPDIRVLICE
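
The table legend defines one-letter residue codes together with four special symbols (X, *, / and \). As an illustration only, the following minimal Python sketch tallies those symbols in a single segment. The function name `summarize_segment` and the dictionary keys are hypothetical and are not part of the specification; characters outside the legend (for example stray digits left by text extraction) are reported rather than interpreted.

```python
# Illustrative sketch only: names and output keys are assumptions,
# not part of the specification.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Special symbols as defined in the table legend.
SPECIAL_SYMBOLS = {
    "X": "unknown residue",
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Count residues and legend symbols in one segment; whitespace is ignored."""
    cleaned = "".join(segment.split())
    summary = {"residues": 0, "unrecognized": []}
    for label in SPECIAL_SYMBOLS.values():
        summary[label] = 0
    for ch in cleaned:
        if ch in STANDARD_RESIDUES:
            summary["residues"] += 1
        elif ch in SPECIAL_SYMBOLS:
            summary[SPECIAL_SYMBOLS[ch]] += 1
        else:
            # Anything not covered by the legend is collected, not guessed at.
            summary["unrecognized"].append(ch)
    return summary


if __name__ == "__main__":
    # One line taken from the table above (raw string keeps the backslash literal).
    example = r"KK*SRAVG\ATWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT"
    print(summarize_segment(example))
```

Collecting unrecognized characters separately keeps the tally faithful to the legend and makes damaged lines easy to spot.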



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WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the 
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion 
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and 
3573-5358, and complementary sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 






WO 01/53312 



PCT/US00/34263 



(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in 
the sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 
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a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and 
3573-5358, a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active 
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a 
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786 
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 
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20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144, 
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO:1-1786 and 3573-5358. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
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